Library and Information Service ›› 2019, Vol. 63 ›› Issue (7): 135-145. DOI: 10.13266/j.issn.0252-3116.2019.07.016

• Reviews •

A Review of Methods and Applications for Author-Topic Model and Its Improved Models

Xu Han1,2, Liu Xiaoping1,2

  1. National Science Library, Chinese Academy of Sciences, Beijing 100190;
  2. Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190
  • Received: 2018-07-02 Revised: 2018-11-27 Online: 2019-04-05 Published: 2019-04-05
  • Corresponding author: Liu Xiaoping (ORCID: 0000-0002-3342-8041), associate research fellow, Ph.D., master's supervisor, E-mail: liuxp@mail.las.ac.cn
  • About the author: Xu Han (ORCID: 0000-0002-0896-1637), master's student.

Abstract: [Purpose/significance] The Author-Topic (AT) model, a probabilistic model that has drawn considerable attention in computer science in recent years, is widely applied in text mining, natural language processing, and related areas. This paper analyzes the ideas behind the AT model and its improved variants, and their applications, in China and abroad, in order to capture the current state of research and to provide a reference for researchers in computer science, library and information science, and related fields. [Method/process] Using the Web of Science Core Collection, DBLP, and CNKI (China National Knowledge Infrastructure) as literature sources, a collection of literature on the AT model and its improved variants was built by defining retrieval rules, de-duplicating records, and screening them manually. The existing research is then summarized through literature analysis, from the perspective of the model's application process. [Result/conclusion] The analysis shows that existing research has formed a fairly complete analysis workflow, and that the angles of improvement and the application areas of the models are increasingly diverse. However, performance optimization, the standardization and refinement of model evaluation metrics, and further application in library and information science still need in-depth exploration.

Key words: Author-Topic model, topic evolution, community detection, model evaluation
