[Purpose/Significance] In public health emergencies, the rapid spread of rumors can cause mass anxiety and panic. This study aims to detect potential rumor spreaders on social media, to explore and evaluate the features that matter most for identifying them, and to provide strategies for public opinion control and network governance. [Method/Process] This study proposes a multi-feature fusion model for detecting potential rumor spreaders in the context of public health emergencies. First, the semantic features of Weibo posts are extracted with a BERT-BiLSTM model and fused with user features, Weibo features and emotion features. Then, a user classification model is built on the LightGBM algorithm, and the model is interpreted with SHAP values. [Result/Conclusion] The experimental results show that the multi-feature fusion model for identifying rumor spreaders in public health emergencies reaches an accuracy of 87.94% on the Weibo dataset, indicating good detection performance. Moreover, all four feature dimensions proposed in this paper contribute to rumor spreader identification, and the text semantic features yield the largest improvement in identification accuracy.
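The fusion and interpretation steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: fusion is shown as plain concatenation of the four feature groups, `toy_score` is a hypothetical hand-written scoring function standing in for the trained LightGBM classifier, all numeric values are illustrative, and Shapley values are computed exactly over the four groups by enumerating orderings rather than with the SHAP library.

```python
from itertools import permutations
from typing import Callable, Dict, List

# The four feature groups named in the abstract, in a fixed fusion order.
GROUPS = ("semantic", "user", "weibo", "emotion")

FeatureGroups = Dict[str, List[float]]


def fuse(groups: FeatureGroups) -> List[float]:
    """Fuse per-group features by concatenation (an assumption)."""
    return [value for name in GROUPS for value in groups[name]]


def toy_score(vector: List[float]) -> float:
    """Hypothetical stand-in for a trained classifier's raw score."""
    weights = [0.5, -0.2, 0.3, 0.1, 0.4, -0.1]  # illustrative, not learned
    linear = sum(w * x for w, x in zip(weights, vector))
    interaction = 0.2 * vector[0] * vector[2]  # semantic x user interaction
    return linear + interaction


def group_shapley(
    score: Callable[[List[float]], float],
    sample: FeatureGroups,
    baseline: FeatureGroups,
) -> Dict[str, float]:
    """Exact Shapley value of each feature group for one sample.

    A group that is "absent" takes its baseline values; the marginal
    contribution of each group is averaged over all orderings.
    """
    contributions = {name: 0.0 for name in GROUPS}
    orderings = list(permutations(GROUPS))
    for ordering in orderings:
        current = dict(baseline)  # start with every group at baseline
        previous = score(fuse(current))
        for name in ordering:
            current[name] = sample[name]  # switch this group on
            value = score(fuse(current))
            contributions[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in contributions.items()}


# Illustrative feature values for a single hypothetical user.
sample: FeatureGroups = {
    "semantic": [0.8, 0.1],
    "user": [1.0],
    "weibo": [0.5, 0.2],
    "emotion": [0.3],
}
baseline: FeatureGroups = {name: [0.0] * len(vals) for name, vals in sample.items()}

attributions = group_shapley(toy_score, sample, baseline)
for name in GROUPS:
    print(f"{name}: {attributions[name]:+.3f}")
```

The attributions sum to the difference between the score of the sample and the score of the baseline, which is the additivity property that SHAP values rely on. With a real tree model, enumerating orderings would be replaced by a tree-specific explainer such as the SHAP library's `TreeExplainer`, since exact enumeration grows factorially with the number of features.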
Author contributions: Zeng Ziming (曾子明) proposed the research topic and research approach and finalized the paper; Zhang Yu (张瑜) built the experimental model, collected and analyzed the data, and wrote the paper; Li Tingting (李婷婷) designed the technical route and revised the paper.