Intelligence Research

Research on Similar Patent Identification Based on Multimodal Feature Fusion

  • Xie Xiaodong,
  • Wu Jie,
  • Sheng Yongxiang,
  • Wang Jiangang,
  • Zhou Xiao
  • School of Economics and Management, Jiangsu University of Science and Technology, Zhenjiang 212003
Xie Xiaodong, PhD candidate: research design, experiments, and manuscript writing; Sheng Yongxiang, professor, PhD, master's supervisor: manuscript revision; Wang Jiangang, lecturer, PhD, master's supervisor: manuscript revision; Zhou Xiao, lecturer, PhD: manuscript revision

Received date: 2024-01-15

  Revised date: 2024-04-12

  Online published: 2024-10-08

Supported by

This work was supported by the General Program of the National Natural Science Foundation of China, “Resilience of Industrial Innovation Ecosystems for Industrial Safety: Implications, Assessment, and Optimization Strategies” (Grant No. 72171122), and by the Postgraduate Research and Practice Innovation Program of Jiangsu Province, “Research on the Selection of Potential Partners and Collaboration Directions for Innovation Consortia” (Grant No. KYCX23_3817).

Cite this article

Xie Xiaodong, Wu Jie, Sheng Yongxiang, Wang Jiangang, Zhou Xiao. Research on Similar Patent Identification Based on Multimodal Feature Fusion[J]. Library and Information Service, 2024, 68(18): 112-122. DOI: 10.13266/j.issn.0252-3116.2024.18.011

Abstract

[Purpose/Significance] The rapid growth in the number of patents poses major challenges for patent retrieval, and how to use advanced computing techniques to identify similar patents has become a pressing problem. [Method/Process] This paper proposes a similar patent identification method based on multimodal feature fusion. The BERT-wwm and ResNet-50 models extract the text-modality and image-modality features of patents, respectively. Self-attention and cross-attention mechanisms are then combined to exploit the feature information within each modality and the interaction information between modalities, and on this basis the model is trained and optimized to identify similar patents. [Result/Conclusion] In an empirical study on patent data with the IPC classification “C08F10/00”, the proposed model achieves an accuracy of 80.03% and a recall of 82.01%, outperforming the baseline models. In a simulated similar patent identification experiment, the model reaches a recall of 88.89%, indicating good performance in practical use. Combining text-modality and image-modality features effectively improves the accuracy and efficiency of similar patent identification. The method helps improve patent retrieval efficiency, speed up patent examination, support patent early-warning analysis, and strengthen intellectual property protection.
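
The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature rows are tiny hand-written stand-ins for BERT-wwm token features and ResNet-50 image features, the attention uses no learned projection matrices, both modalities are assumed to share one feature width, and the names `attention`, `mean_pool`, and `fuse` are hypothetical. It only shows how self-attention mixes information within a modality and how cross-attention lets each modality query the other before the results are pooled into a single patent vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention without learned projections (simplification)."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value rows.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def mean_pool(rows):
    """Average a list of equal-length rows into one vector."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(rows[0]))]


def fuse(text_feats, image_feats):
    """Hypothetical fusion: self-attention per modality, then cross-attention."""
    # Intra-modal information: each modality attends to itself.
    text_self = attention(text_feats, text_feats, text_feats)
    image_self = attention(image_feats, image_feats, image_feats)
    # Inter-modal interaction: text queries image, and image queries text.
    text_cross = attention(text_self, image_self, image_self)
    image_cross = attention(image_self, text_self, text_self)
    # Concatenate the pooled views into one fused vector.
    return mean_pool(text_cross) + mean_pool(image_cross)


# Toy inputs (assumed): 3 text tokens and 2 image regions, each of width 4.
text_feats = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
image_feats = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
fused = fuse(text_feats, image_feats)
print(len(fused))  # 8
```

In a real system the two encoders would typically produce features of different widths, so a learned projection to a shared width would be needed before cross-attention, and the fused vector would feed a trained classifier or similarity head rather than being used directly.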
