Library and Information Service, 2020, Vol. 64, Issue (11): 145-152. DOI: 10.13266/j.issn.0252-3116.2020.11.016

• Review and Commentary •

Progress on Methods of Emerging Technology Topics Identification

Liu Xiaoling1,2, Tan Zongying1

  1 National Science Library, Chinese Academy of Sciences, Beijing 100190;
  2 Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100049
  • Received: 2019-04-18 Revised: 2019-11-17 Online: 2020-06-05 Published: 2020-06-05
  • Corresponding author: Tan Zongying (ORCID: 0000-0003-3945-7174), research fellow, doctoral supervisor, E-mail: tanzy@mail.las.ac.cn
  • About the author: Liu Xiaoling (ORCID: 0000-0001-7523-247X), assistant research fellow, doctoral candidate.
  • Funding:
    This paper is one of the research outcomes of the National Natural Science Foundation of China project "Research on the Disciplinary Evolution and Funding Policy of the Science Fund" (Grant No. 71843005).

Abstract: [Purpose/significance] Identifying emerging technology topics helps not only to track technological developments in a timely manner but also to capture, as early as possible, future development opportunities and likely trends in a technology field. Reviewing the quantitative methods for emerging technology topic identification and comparing their strengths and weaknesses can provide a reference for improving and refining these methods. [Method/process] First, the concepts of "emerging technology" and "emerging technology topic identification" are clarified. Then, qualitative and quantitative methods for identifying emerging technology topics, in China and abroad, are surveyed and systematically reviewed, with a focus on quantitative methods based on bibliometrics and data mining. These quantitative methods are divided into three categories: keyword or document statistics, citation network clustering, and text mining. Finally, the similarities, differences, and shortcomings of these methods are analyzed with respect to technology topic extraction, the construction of indicator systems for identifying emerging technology topics, and the validation of method effectiveness, and preliminary thoughts on improving the methods are offered. [Result/conclusion] The three categories of methods each have their own characteristics, advantages, and disadvantages in the main steps of emerging technology topic identification, and all leave room for improvement. Future work could explore techniques such as deep learning to extract technology topics accurately, build a more comprehensive and systematic indicator system for identifying emerging technology topics, and validate method effectiveness more rigorously using machine learning.

Key words: emerging technology, topics identification, methodology research, bibliometrics, text mining

CLC number: