[Purpose/significance] Existing studies measure the disparity of technology convergence only at the level of classification codes; they do not reach the technical semantics behind those codes, and they do not compare the effectiveness of the measurement methods. This study therefore investigates disparity measurement methods for technology convergence from the perspective of technical semantics and compares their effectiveness, with the aim of improving the methodology. [Method/process] Representation learning can draw on large amounts of prior knowledge to compute semantic differences between research objects. On this basis, the study proposes disparity measurement methods based on Word2vec and BERT, which use the definition text of patent classification codes and the text of associated patents to measure the disparity of technology convergence, yielding six measurement methods in total. The six methods are applied to quadrilateral patents filed in 2019-2020, and the results are compared with existing disparity measures based on the co-occurrence frequency and the co-occurrence relationships of classification codes. [Result/conclusion] Vectorizing MC classification codes with Word2vec, using both the definition text of the codes and the associated patent text, measures the disparity of technology convergence more effectively than the other schemes, and can be applied more widely in future research on technology convergence.
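The core idea of the method, representing each classification code as a vector built from its text and then comparing vectors, can be illustrated with a minimal sketch. Several things here are assumptions rather than details stated in the abstract: the toy three-dimensional word vectors stand in for a trained Word2vec model, a code vector is taken as the mean of its token vectors, and disparity is taken as one minus cosine similarity. The names `TOY_VECTORS`, `code_vector`, and `disparity` are hypothetical.

```python
import numpy as np

# Hypothetical toy word vectors standing in for a trained Word2vec model.
TOY_VECTORS = {
    "battery": np.array([0.9, 0.1, 0.0]),
    "electrode": np.array([0.8, 0.2, 0.1]),
    "antenna": np.array([0.1, 0.9, 0.2]),
    "signal": np.array([0.0, 0.8, 0.3]),
}


def code_vector(tokens, vectors):
    """Represent a classification code as the mean vector of its known tokens.

    `tokens` is assumed to pool words from the code's definition text and
    from the text of patents associated with the code.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("none of the tokens has a vector")
    return np.mean(known, axis=0)


def disparity(u, v):
    """Assumed disparity measure: one minus the cosine similarity of u and v."""
    cos = float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    return 1.0 - cos


# Definition-text tokens plus associated-patent tokens for two made-up codes.
vec_a = code_vector(["battery", "electrode"] + ["battery"], TOY_VECTORS)
vec_b = code_vector(["antenna"] + ["signal"], TOY_VECTORS)

print(f"disparity(a, b) = {disparity(vec_a, vec_b):.3f}")
```

Under these assumptions, two codes whose texts share vocabulary receive a disparity near zero, while codes described in unrelated terms receive a larger value. Swapping the toy dictionary for vectors from a trained model would leave the two functions unchanged.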