Library and Information Service ›› 2014, Vol. 58 ›› Issue (02): 138-142. DOI: 10.13266/j.issn.0252-3116.2014.02.023

• Knowledge Organization •

Content Topic Mining and Evolution Based on a Dynamic LDA Topic Model

Hu Jiming, Chen Guo

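  1. Center for Studies of Information Resources, Wuhan University
  • Received: 2013-11-13 Revised: 2014-01-04 Published: 2014-01-20 Online: 2014-01-20
  • About the authors: Hu Jiming, lecturer at the Center for Studies of Information Resources, Wuhan University, E-mail: whuhujiming@qq.com; Chen Guo, doctoral candidate at the Center for Studies of Information Resources, Wuhan University.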
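  • Funding:

    This article is one of the research outcomes of the Ministry of Education Humanities and Social Sciences Youth Fund project "Research on Content Topic Mining and Semantic Classification of Information in the Social Network Environment" (Grant No. 13YJC870008) and the National Natural Science Foundation of China Youth Fund project "Research on Information Recommendation Based on User-Resource Associations in the Social Network Environment" (Grant No. 71303178).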

Mining and Evolution of Content Topics Based on Dynamic LDA

Hu Jiming, Chen Guo   

  1. Center for Studies of Information Resources, Wuhan University, Wuhan 430072
  • Received: 2013-11-13 Revised: 2014-01-04 Online: 2014-01-20 Published: 2014-01-20

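Abstract (translated from the Chinese):

Mining text content topics and tracking their evolution plays an important role in text modeling and in improving classification and recommendation performance. Starting from an analysis of the principles of LDA-based text content topic mining, and considering the characteristics of text content in the current network environment, this paper constructs an LDA model suited to mining topics from dynamic text content and improves the accuracy of topic mining through an improved Gibbs sampling estimation. It then studies how content topics evolve over time in terms of topic similarity and topic intensity. Experiments show that the proposed method is feasible and effective, and that it has practical significance for subsequent research on text semantic modeling and classification.

Keywords: topic mining, topic evolution, dynamic LDA model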

Abstract:

Mining text topics and tracking their evolution is important for text modeling and classification, as well as for recommendation services. Starting from an analysis of the theory of LDA-based text topic modeling, and targeting the dynamic characteristics of text content in the social networking environment, this article constructs a dynamic LDA model for mining text topics. The accuracy of topic mining is then improved through incremental Gibbs sampling and estimation. Furthermore, the evolution of dynamic text content topics is analyzed in terms of topic similarity and intensity. The experiment demonstrates that the proposed methods are feasible and effective, and they provide a foundation for further study of semantic modeling and text classification.

Key words: topic mining, topic evolution, dynamic LDA model
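
The abstract describes tracking topic evolution through topic intensity and topic similarity but does not give the formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: intensity is taken as the mean document-topic proportion within a time slice, similarity as the cosine similarity between the word distributions of the same topic in adjacent slices, and topic k in one slice is assumed to correspond to topic k in the next. The names `topic_intensity`, `cosine_similarity`, `theta_by_slice`, and `phi_by_slice` are hypothetical, and the numbers are toy data.

```python
import math

def topic_intensity(theta):
    """Mean share of each topic over the documents of one time slice.

    theta: list of per-document topic distributions (each sums to 1).
    Assumption: intensity = average topic proportion; the paper may differ.
    """
    n_docs = len(theta)
    n_topics = len(theta[0])
    return [sum(doc[k] for doc in theta) / n_docs for k in range(n_topics)]

def cosine_similarity(p, q):
    """Cosine similarity between two topic-word distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

# Toy data (hypothetical): 2 time slices, 2 documents each, 2 topics,
# and a 4-word vocabulary.
theta_by_slice = [
    [[0.9, 0.1], [0.7, 0.3]],  # document-topic distributions, slice 0
    [[0.4, 0.6], [0.2, 0.8]],  # document-topic distributions, slice 1
]
phi_by_slice = [
    [[0.5, 0.3, 0.1, 0.1], [0.1, 0.1, 0.4, 0.4]],  # topic-word, slice 0
    [[0.4, 0.4, 0.1, 0.1], [0.1, 0.1, 0.3, 0.5]],  # topic-word, slice 1
]

# Intensity of every topic in every slice.
intensity = [topic_intensity(theta) for theta in theta_by_slice]

# Similarity of each topic to itself between adjacent slices
# (assumes topic indices are aligned across slices).
similarity = [
    [cosine_similarity(prev[k], curr[k]) for k in range(len(prev))]
    for prev, curr in zip(phi_by_slice, phi_by_slice[1:])
]

print("intensity per slice:", intensity)
print("adjacent-slice similarity per topic:", similarity)
```

In this toy data the first topic loses intensity between the two slices while the second gains it, and both topics keep a high similarity to their earlier word distributions. That is the kind of pattern an intensity-and-similarity analysis is meant to surface. A divergence such as Jensen-Shannon could replace cosine similarity without changing the structure.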

CLC number: