[Purpose/significance] Data collection is the first stage of online public opinion research. When the volume of data is large, a heat assessment model for public opinion tweets makes it possible to quickly select the data that are useful for such research. [Method/process] Drawing on the definition of average self-information from information theory, this paper uses the Analytic Hierarchy Process (AHP) and the Hacker News ranking algorithm to construct a heat assessment model for online public opinion tweets. [Result/conclusion] Using data crawled from Weibo, the paper computes a heat threshold specific to that data set and uses it to verify the accuracy of the model. The results indicate that the model computes tweet heat effectively and with high accuracy.
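The abstract names two building blocks without giving their formulas. As a rough illustration only, the sketch below implements average self-information (Shannon entropy) and the commonly cited form of the Hacker News ranking formula, (P - 1) / (T + 2)^G. The function names, the default gravity of 1.8, and the way the two quantities would be combined with AHP weights are assumptions made for illustration; they are not taken from the paper.

```python
import math


def average_self_information(probabilities):
    """Average self-information (Shannon entropy) in bits.

    H = -sum(p * log2(p)); zero-probability outcomes contribute nothing.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def hacker_news_score(points, age_hours, gravity=1.8):
    """Commonly cited Hacker News ranking form: (P - 1) / (T + 2) ** G.

    points    -- interaction count for the post (hypothetical input)
    age_hours -- hours since the post was published
    gravity   -- time-decay exponent; 1.8 is the commonly quoted default
    """
    return (points - 1) / (age_hours + 2) ** gravity


if __name__ == "__main__":
    # A uniform two-outcome distribution carries exactly one bit.
    print(average_self_information([0.5, 0.5]))
    # With equal points, an older post scores lower than a newer one.
    print(hacker_news_score(101, 1) > hacker_news_score(101, 10))
```

The time-decay term is what lets an older post with many interactions fall below a newer post with fewer, which is presumably why a ranking formula of this kind suits heat measurement.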