Intelligence Research

Early-Warning Analysis of Microblog Topics Using a Real-Time Linear Model

  • Rao Hao,
  • Wen Haining
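  • 1. School of Information Science and Engineering, Shaoguan University, Shaoguan 512005;
    2. School of Mathematics and Statistics, Guangxi Normal University, Guilin 541004
Rao Hao (ORCID: 0000-0001-9133-6025), associate professor, master's degree, E-mail: gdrh@sgu.edu.cn; Wen Haining (ORCID: 0000-0001-6991-9822), master's student.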

Received date: 2017-04-14

Revised date: 2017-06-04

Online published: 2017-08-05

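Funding

This paper is one of the research outcomes of the Ministry of Education Humanities and Social Sciences Research Project "Research on the Discovery of Latent Public Opinion on Social Media and Its Guidance and Control Mechanisms" (Grant No. 13YJCZH144) and the Guangdong Province Climbing Program project "A Trend Prediction System for Hot Microblog Topics among University Students" (Grant No. pdjh2015a0471).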

Research on Microblog Topics Early Warning System Based on Real-time Linear Model

  • Rao Hao,
  • Wen Haining
  • 1. School of Information Science and Engineering, Shaoguan University, Shaoguan 512005;
    2. School of Mathematics and Statistics, Guangxi Normal University, Guilin 541004

Received date: 2017-04-14

  Revised date: 2017-06-04

  Online published: 2017-08-05

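Abstract

[Purpose/significance] Microblogs play an important role in today's information dissemination. To predict microblog hot topics effectively and to guide and control public opinion, a real-time linear early-warning model is built. [Method/process] After missing values and outliers in the collected indicators are handled, factor analysis and stepwise regression are compared on microblog topic heat and the influence factors of verified "big V" accounts, and the common impact factors are selected. These factors are then weighted, and the best quantitative formula is sought under different weight-adjustment factors. At each run, the formula takes the data from the 3 hours preceding the current time as input and predicts the topic words corresponding to the weighted values for the 30 minutes following the current time; the parameters are re-estimated every 10 minutes. [Result/conclusion] Experiments show that the prediction model greatly reduces the time needed for data collection, parsing, and prediction while maintaining good accuracy, and that precision can be improved further by choosing a suitable threshold.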

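Cite this article

Rao Hao, Wen Haining. Research on Microblog Topics Early Warning System Based on Real-time Linear Model[J]. Library and Information Service (图书情报工作), 2017, 61(15): 130-137. DOI: 10.13266/j.issn.0252-3116.2017.15.015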

Abstract

[Purpose/significance] Microblogs play a significant role in information diffusion. A real-time linear model was built to predict microblog hot topics and to guide public opinion effectively. [Method/process] Missing values and outliers in the collected indicators were processed first. Factor analysis and stepwise regression were then compared on microblog topic hotness and the influence factors of VIP users, and the common impact factors were selected and weighted. The most suitable formula was obtained by testing different weight-adjustment factors. Microblog data from the preceding 3 hours are used to predict hot topics for the following 30 minutes, and the parameters are updated every 10 minutes. [Result/conclusion] Experiments show that the prediction model greatly reduces the time needed for data acquisition, parsing, and prediction while maintaining relatively good accuracy. Accuracy can be improved further by selecting an appropriate threshold.
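
The rolling scheme described in the abstract (3 hours of history, a 30-minute forecast, refitting every 10 minutes, and an alert threshold) can be illustrated with a minimal Python sketch. It is not the authors' formula: the abstract gives neither the selected impact factors nor their weights, so the weights passed to `weighted_score` are hypothetical, and fitting a straight line of the weighted score against time is an assumption made only to keep the example self-contained. All function names are illustrative.

```python
from typing import List, Sequence, Tuple

STEP_MIN = 10      # refit interval stated in the abstract
WINDOW_MIN = 180   # 3 hours of history used for each fit
HORIZON_MIN = 30   # forecast horizon


def weighted_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Combine impact factors into one heat score (the weights are hypothetical)."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(f * w for f, w in zip(factors, weights))


def fit_line(ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of ys against sample index 0..n-1; returns (slope, intercept)."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def forecast(scores: Sequence[float]) -> float:
    """Refit on the trailing 3-hour window and extrapolate 30 minutes ahead."""
    window = scores[-(WINDOW_MIN // STEP_MIN):]
    slope, intercept = fit_line(window)
    steps_ahead = HORIZON_MIN // STEP_MIN
    return intercept + slope * (len(window) - 1 + steps_ahead)


def rolling_alerts(scores: Sequence[float], threshold: float) -> List[Tuple[int, float]]:
    """Refit at every 10-minute step; return (step index, forecast) pairs at or above the threshold."""
    window_steps = WINDOW_MIN // STEP_MIN
    alerts = []
    for t in range(window_steps, len(scores) + 1):
        predicted = forecast(scores[:t])
        if predicted >= threshold:
            alerts.append((t - 1, predicted))
    return alerts
```

Calling `rolling_alerts(scores, threshold)` on a series of 10-minute weighted scores returns the steps at which the 30-minute forecast reaches the threshold. Raising the threshold yields fewer alerts, which is the lever the abstract mentions for trading coverage against precision.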
