[Purpose/significance] This paper applies LDA topic analysis to "the Belt and Road" related news in official media and builds a basic framework for news topic analysis with the LDA model, to help the public understand the dynamics and progress of the initiative and its focus in different periods. [Method/process] News items related to "the Belt and Road" published on the Chinese government website from 2015 to 2017 were selected, and topic extraction and heat evolution analysis were carried out with the LDA model. [Result/conclusion] A total of 30 topics were extracted and grouped into seven categories: policy coordination, facilities connectivity, unimpeded trade, financial integration, people-to-people bonds, economic impact, and government work. The policy coordination category has the highest heat over the whole period, followed by the unimpeded trade and economic impact categories. The heat of some topics, such as "reform and transformation", declines over time, while that of others, such as "import and export", increases. These results reflect changes in the attention that official media pay to different news topics related to "the Belt and Road".
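The abstract does not define how topic heat is computed. One common convention, assumed here purely for illustration, takes the heat of a topic in a period to be its mean weight across the documents published in that period. The sketch below starts from document-topic distributions that a fitted LDA model would supply; the function name `topic_heat` and the toy values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def topic_heat(doc_topics, periods):
    """Average each topic's weight over the documents in each period.

    doc_topics: list of per-document topic distributions (each sums to ~1).
    periods:    list of period labels (e.g. years), one per document.
    Returns {period: [heat of topic 0, heat of topic 1, ...]}.
    """
    if len(doc_topics) != len(periods):
        raise ValueError("doc_topics and periods must have the same length")
    sums = {}
    counts = defaultdict(int)
    for theta, period in zip(doc_topics, periods):
        if period not in sums:
            sums[period] = [0.0] * len(theta)
        for k, weight in enumerate(theta):
            sums[period][k] += weight
        counts[period] += 1
    return {p: [s / counts[p] for s in sums[p]] for p in sums}


# Made-up toy distributions over three hypothetical topics (not real data).
doc_topics = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.2, 0.6],
    [0.1, 0.3, 0.6],
]
periods = [2015, 2015, 2017, 2017]

heat = topic_heat(doc_topics, periods)
for year in sorted(heat):
    print(year, [round(h, 2) for h in heat[year]])
```

Under this assumed definition, the heats within a period sum to one, so a rise in one topic's heat necessarily comes with a fall in others. Comparing the per-period lists shows which topics gain or lose attention over time.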
Qin Yue, Wu Yaping, Wang Jimin. An Analysis of News Topics Mining Based on LDA Model: Taking “The Belt and Road” Related News as an Example[J]. Library and Information Service, 2019, 63(15): 103-110.
DOI: 10.13266/j.issn.0252-3116.2019.15.012