Library and Information Service, 2022, Vol. 66, Issue (4): 118-128. DOI: 10.13266/j.issn.0252-3116.2022.04.012

• Information Research •

Research on the Measurement Method and Effect of Technology Convergence Disparity Based on Representation Learning

Lyu Lucheng1,2, Zhao Yajuan1,2, Wang Xuezhao1,2, Han Tao1,2, Zhao Ping1, Zhang Di1

  1. National Science Library, Chinese Academy of Sciences, Beijing 100190;
  2. Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190
  • Received: 2021-07-09 Revised: 2021-09-13 Online: 2022-02-20 Published: 2022-03-01
  • Corresponding author: Zhao Yajuan, research fellow, PhD, doctoral supervisor, E-mail: zhaoyj@mail.las.ac.cn
  • About the authors: Lyu Lucheng, assistant research fellow, doctoral candidate; Wang Xuezhao, research fellow, PhD, master's supervisor; Han Tao, research fellow, PhD, master's supervisor; Zhao Ping, associate research fellow, master's degree; Zhang Di, associate research fellow, PhD.
  • Funding:
    This paper is one of the research outcomes of the Chinese Academy of Sciences Strategic Research Special Project "Research on the Layout of Basic Research and the Reserve of Key Technologies Supporting the Development of China's Key Industries" (Project No. GHJ-ZLZX-2020-31-5).

Abstract: [Purpose/significance] Existing studies measure the disparity of technology convergence only at the level of patent classification codes, without reaching the technical semantics behind those codes, and they do not compare the effectiveness of different measurement methods. This paper therefore studies disparity measurement methods from the perspective of revealing technical semantics and compares their effectiveness, with the aim of improving the methodology. [Method/process] Representation learning can draw on a large amount of prior knowledge to compute the semantic differences between research objects. On this basis, the paper proposes methods for measuring the disparity of technology convergence based on Word2vec and BERT. These methods use the definition text of patent classification codes and the text of the associated patents, yielding six measurement methods in total. The six methods are applied to quadrilateral patents filed in 2019-2020, and their results are compared with those of existing disparity measures based on the co-occurrence frequency and co-occurrence relationships of classification codes. [Result/conclusion] Vectorizing MC classification codes with Word2vec, using both the classification code definition text and the associated patent text, measures the disparity of technology convergence more effectively than the other schemes and can be applied in future research on technology convergence.

Key words: disparity, technology convergence, technology fusion, representation learning, BERT, Word2vec
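
The abstract does not state the formulas used, so the following is only a minimal sketch of the general idea, under two assumptions that are not taken from the paper: the disparity between two classification codes is taken as one minus the cosine similarity of their vectors, and the disparity of a patent is taken as the mean over all pairs of its distinct codes. The code labels and the two-dimensional vectors are hypothetical toy values; in practice the vectors would come from a model such as Word2vec or BERT trained or applied on the classification code definition text and the associated patent text.

```python
import math
from itertools import combinations

# Hypothetical toy vectors standing in for learned code embeddings.
CODE_VECTORS = {
    "CODE_A": [1.0, 0.0],
    "CODE_B": [0.0, 1.0],
    "CODE_C": [1.0, 1.0],
}


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def pair_disparity(u, v):
    """Assumed disparity between two codes: 1 - cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


def patent_disparity(codes, code_vectors):
    """Assumed patent-level disparity: mean disparity over all pairs of
    distinct codes assigned to the patent (0.0 if fewer than two codes)."""
    unique_codes = sorted(set(codes))
    pairs = list(combinations(unique_codes, 2))
    if not pairs:
        return 0.0
    total = sum(
        pair_disparity(code_vectors[a], code_vectors[b]) for a, b in pairs
    )
    return total / len(pairs)


if __name__ == "__main__":
    print(patent_disparity(["CODE_A", "CODE_B", "CODE_C"], CODE_VECTORS))
```

With this kind of measure, a patent whose codes have nearly parallel vectors scores close to 0, while a patent combining semantically distant codes scores higher. A co-occurrence-based baseline of the kind the abstract compares against would replace the embedding vectors with rows of a code co-occurrence matrix.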

CLC number: