[Purpose/significance] This study aims to predict whether microblog users will retweet or comment on microblog entries that contain wanted notices. It also identifies the features that most affect the spread of such entries, in order to help public security departments improve their operational performance and strengthen communication and cooperation between the police and the public. [Method/process] Based on the characteristics of wanted-notice microblog entries, we combined user features, time features and structure features with event features extracted from the entries, such as location keywords, time keywords and the wanted level. The XGBoost algorithm was used to calculate the importance of each feature for retweet and comment prediction. Combining the structure of the propagation network with node attributes, we then trained and evaluated a prediction model based on heterogeneous information network embedding. [Result/conclusion] The model achieved AUC values of 0.737 on the retweet data set and 0.799 on the comment data set. Because it integrates network structure with the attributes of different node types, the model is closer to real heterogeneous information networks and is more accurate than traditional link prediction models. The feature importance results also show that, among all features affecting the prediction of retweets and comments, the keyword features within the proposed event features are the most important.
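To make the reported AUC values easier to interpret, the following is a minimal sketch of how AUC can be computed for a link prediction task such as "will this user retweet this entry". It is not the authors' evaluation code. The function name `pairwise_auc` and the toy scores are hypothetical, and the sketch assumes the model outputs one real-valued score per candidate user-entry link.

```python
from itertools import product


def pairwise_auc(pos_scores, neg_scores):
    """Return the probability that a randomly chosen observed link
    (e.g. an actual retweet) is scored higher than a randomly chosen
    unobserved link. Ties count as half a win."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores, for illustration only (not data from the study).
observed_links = [0.9, 0.7, 0.4]         # user-entry pairs that were retweeted
unobserved_links = [0.8, 0.3, 0.2, 0.1]  # sampled pairs that were not

toy_auc = pairwise_auc(observed_links, unobserved_links)
print(round(toy_auc, 3))
```

Under this reading, an AUC of 0.799 means that an observed comment link outranks a randomly sampled unobserved link in roughly 80% of pairwise comparisons, while 0.5 would correspond to random scoring.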
Sun Ran, An Lu. Propagation Prediction of Police Microblog Entries Based on Heterogeneous Information Network[J]. Library and Information Service, 2020, 64(21): 67-76.
DOI: 10.13266/j.issn.0252-3116.2020.21.010