[Purpose/Significance] Characters and plot are the two pillars of a data story. The plot of a data story unfolds through its characters' traits, behaviors, expected goals, the reality they face, and the biases they hold. The automatic generation of data story characters is therefore one of the core research topics in data storytelling, and it matters for the theoretical study, automatic generation, and engineering-oriented development of data stories. [Method/Process] First, this paper examines the types, features, and operations of data story characters. Second, it proposes a character generation method based on counterfactual explanations and gives automatic generation methods for five kinds of characters in a data story: the protagonist, similar characters, heterogeneous characters, positive characters, and negative characters. It then analyzes the technical implementation, covering the experimental design, data sources, method selection, and discussion of results. Finally, it summarizes the main findings and offers suggestions for future research. [Result/Conclusion] This paper is the first in the field of data storytelling to study, in a relatively systematic way, the constituent elements, basic types, main features, and core operations of data story characters, and it proposes a counterfactual-based method for automatically generating them.