Library and Information Service ›› 2023, Vol. 67 ›› Issue (3): 49-60. DOI: 10.13266/j.issn.0252-3116.2023.03.005

• Information Research •

Research on Technology Innovation Path Recognition Integrating Citation and Text Features

Yue Lixin1, Liu Ziqiang1, Liu Chunjiang2,3, Fang Shu2,3

    1 School of Journalism and Communication, Nanjing Normal University, Nanjing 210023;
    2 Chengdu Library of Chinese Academy of Sciences, Chengdu 610041;
    3 Department of Information Resources Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190
  • Received: 2022-08-09 Revised: 2022-11-12 Online: 2023-02-24 Published: 2023-02-24
  • Corresponding author: Liu Ziqiang, lecturer, PhD, E-mail: lzq@njnu.edu.cn
  • About the authors: Yue Lixin, lecturer, PhD; Liu Chunjiang, associate research librarian, PhD; Fang Shu, research fellow, doctoral supervisor.
  • Funding:
    This work is one of the research outcomes of the National Social Science Fund of China project "Research on Methods for Patent Technology Innovation Risk Identification and Technology Innovation Path Prediction" (Grant No. 19BTQ088).

Abstract: [Purpose/Significance] A method for identifying patent technology innovation paths that integrates citation and text features helps to avoid technology innovation risks and to optimize the choice of innovation path. It is therefore significant for improving the innovation capability of innovation entities, promoting the development of modern industry, and planning development strategies at the science and technology frontier. [Method/Process] First, the Node2Vec and Doc2Vec models are used to learn patent citation data and patent text data as computable high-dimensional vectors. Next, the LDA topic model is used to identify technology topics, the t-SNE algorithm is used for dimensionality reduction, and a time dimension is added to build the initial technology innovation paths. Finally, the citation and text feature vectors are concatenated and fused to identify technology innovation paths that integrate both kinds of features. [Result/Conclusion] An empirical study in the supercapacitor field shows that the proposed method can identify patent technology innovation paths efficiently and accurately from the patent literature of a specific field, which demonstrates its feasibility and effectiveness.

Key words: embedding, topic model, citation relationship, innovation path
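
The fusion step named in the abstract, concatenating a citation vector with a text vector for each patent, can be illustrated with a minimal sketch. It is not the authors' implementation. The toy vectors stand in for Node2Vec and Doc2Vec outputs, the patent IDs and function names are hypothetical, and both the L2 normalization before concatenation and the cosine comparison are assumptions that the abstract does not specify.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (assumed preprocessing, not from the paper)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse(citation_vec, text_vec):
    """Concatenate the normalized citation and text vectors into one fused vector."""
    return l2_normalize(citation_vec) + l2_normalize(text_vec)

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy embeddings; real ones would come from Node2Vec (citation
# network) and Doc2Vec (patent text) and would have many more dimensions.
citation_vecs = {"P1": [1.0, 0.0], "P2": [0.8, 0.6], "P3": [0.0, 1.0]}
text_vecs = {"P1": [0.0, 2.0, 0.0], "P2": [0.0, 1.0, 1.0], "P3": [3.0, 0.0, 0.0]}

fused = {pid: fuse(citation_vecs[pid], text_vecs[pid]) for pid in citation_vecs}

sim_12 = cosine(fused["P1"], fused["P2"])
sim_13 = cosine(fused["P1"], fused["P3"])
print(f"sim(P1, P2) = {sim_12:.4f}, sim(P1, P3) = {sim_13:.4f}")
```

Normalizing each part before concatenation keeps either feature type from dominating the fused similarity purely because of its scale. Other weightings are equally plausible.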

CLC number: