THEORETICAL STUDY

An Interpretable Model for New Government Media Public Value Consensus Integrating XGBoost and SHAP: Taking the Top 10 Municipal Government Accounts of the Jinri Toutiao as an Example

  • Yi Ming,
  • Yao Yujia,
  • Hu Min
  • 1. School of Information Management, Central China Normal University, Wuhan 430079;
    2. China Library Innovation and Development Research Center, Central China Normal University, Wuhan 430079

Received date: 2022-03-07

Revised date: 2022-06-20

Online published: 2022-08-19

Abstract

[Purpose/Significance] To accurately identify the key factors that affect public value consensus and the ways in which they act, and to improve the ability of government new media to build consensus, this paper proposes an interpretable model of public value consensus for government new media that integrates XGBoost and SHAP. [Method/Process] The research objects are 500 government account articles and 32,185 comments on Jinri Toutiao. First, the public value consensus of each article is identified, article feature variables are extracted from three dimensions (content, form, and emotion), and the preprocessed data are used as the model input. Second, a prediction model for the public value consensus of government new media is built on XGBoost and compared with other mainstream machine learning algorithms such as LR, SVM, and LGBM to find the best overall model. Finally, the SHAP interpretation framework is introduced to quantify and attribute the importance of each feature variable. [Result/Conclusion] The XGBoost model outperforms the comparison models on four performance indicators: accuracy, recall, F1-score, and AUC. In addition, the study finds that article topic type, public value type, article length, content form, article sentiment attribute, and the number of emojis in the title are important features affecting consensus on government account articles. These features differ in the manner, direction, and strength of their influence on public value consensus.
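The attribution step named in the abstract can be illustrated with a minimal sketch of the Shapley value on which SHAP is based. The code below is not the authors' pipeline: it uses a hand-written toy scoring function in place of a trained XGBoost model, and the feature names (`is_livelihood_topic`, `title_emojis`, `positive_sentiment`), the coefficients, and the all-zero reference input are assumptions made purely for illustration. It computes exact Shapley values by averaging each feature's marginal contribution over every feature ordering, which is only feasible for a handful of features.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Averages each feature's marginal contribution over every ordering
    in which features are switched from the baseline to the instance.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)            # start from the reference input
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # reveal one real feature value
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_consensus_score(x):
    """Hypothetical scorer standing in for a trained classifier."""
    score = 0.2 * x["is_livelihood_topic"] + 0.1 * x["title_emojis"]
    # Interaction term: its effect is shared between the two features.
    score += 0.3 * x["is_livelihood_topic"] * x["positive_sentiment"]
    return score


# Hypothetical article and reference input (all features switched off).
article = {"is_livelihood_topic": 1, "title_emojis": 2, "positive_sentiment": 1}
reference = {"is_livelihood_topic": 0, "title_emojis": 0, "positive_sentiment": 0}

phi = shapley_values(toy_consensus_score, article, reference)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the score of the article and the score of the reference input, and the interaction term is split evenly between the two features it involves. For tree ensembles, SHAP obtains such attributions with a much faster algorithm than this enumeration, so the sketch shows only the quantity being computed, not how it would be computed at scale.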

Cite this article

Yi Ming, Yao Yujia, Hu Min. An Interpretable Model for New Government Media Public Value Consensus Integrating XGBoost and SHAP: Taking the Top 10 Municipal Government Accounts of the Jinri Toutiao as an Example[J]. Library and Information Service, 2022, 66(16): 36-47. DOI: 10.13266/j.issn.0252-3116.2022.16.004
