Library and Information Service (图书情报工作) ›› 2016, Vol. 60 ›› Issue (3): 130-137. DOI: 10.13266/j.issn.0252-3116.2016.03.019

• Reviews •

Research Progress of Scientific and Technical Literature Topic Detection and Evolution Based on Topic Model in China

Wang Yanpeng (王燕鹏)1,2

  1. National Science Library, Chinese Academy of Sciences, Beijing 100190;
  2. University of Chinese Academy of Sciences, Beijing 100049
  • Received: 2015-01-04 Revised: 2016-01-23 Online: 2016-02-05 Published: 2016-02-05
  • About the author: Wang Yanpeng (ORCID: 0000-0002-2583-9895), master's student, E-mail: wangyanpeng@mail.las.ac.cn.

Abstract: [Purpose/significance] This paper analyzes research progress in China on topic detection and topic evolution in scientific and technical literature based on topic models, in order to provide references and research ideas for researchers in the field. [Method/process] The CNKI database and the Wanfang Data Knowledge Service Platform were used as literature sources, and relevant articles were retrieved and screened. The analysis workflow of topic-model-based topic detection and evolution research was extracted through manual reading, and the literature analysis method was used to summarize the strategies, methods, and analytical techniques that Chinese researchers apply at each step of this workflow. [Result/conclusion] The research has begun to take shape and a fairly complete analysis workflow has formed, with diverse strategies, methods, and analytical techniques at each step. Some problems remain: the application of topic models to scientific and technical literature is still immature, the number of topics is fixed, and evaluation methods and criteria for the effectiveness of topic model applications are lacking.

Key words: topic model, topic detection, topic evolution, text preprocessing, parameter estimation

CLC number: