图书情报工作, 2022, Vol. 66, Issue 19: 26-35. DOI: 10.13266/j.issn.0252-3116.2022.19.003

Special topic: Research on the Construction of a Jixia Studies Literature Database for Digital Humanities Research

面向数字人文的稷下思想自动分类研究

冯梦莹1, 白如江1, 张玉洁1, 王效岳1, 耿振东2, 王志民2   

  1. 山东理工大学信息管理研究院, 淄博 255049;
  2. 山东理工大学齐文化研究院, 淄博 255049
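  • Received: 2022-04-21 Revised: 2022-08-20 Online: 2022-10-05 Published: 2022-10-25
  • Corresponding author: Bai Rujiang (白如江), professor, doctoral supervisor, E-mail: brj@sdut.edu.cn.
  • About the authors: Feng Mengying, master's student; Zhang Yujie, master's student; Wang Xiaoyue, professor, master's supervisor; Geng Zhendong, deputy director, editor-in-chief of Guan Zi Journal (《管子学刊》); Wang Zhimin, professor, doctoral supervisor.
  • Funding:
    This paper is one of the research outcomes of the Major Project of Philosophy and Social Sciences Research of the Ministry of Education, "Research on the Collation of Jixia School Literature and Database Construction" (Grant No. 19JZD011).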

Research on Automatic Classification of Jixia Thought for Digital Humanities

Feng Mengying1, Bai Rujiang1, Zhang Yujie1, Wang Xiaoyue1, Geng Zhendong2, Wang Zhimin2

  1. Institute of Information Management, Shandong University of Technology, Zibo 255049;
  2. Qi Culture Research Institute, Shandong University of Technology, Zibo 255049
  • Received: 2022-04-21 Revised: 2022-08-20 Online: 2022-10-05 Published: 2022-10-25

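Abstract: [Purpose/Significance] Jixia thought is an overlooked treasure from the pre-Qin era in which the Hundred Schools of Thought contended. This paper studies how to automatically identify Jixia thought in the Jixia research literature, providing a methodological basis for digital humanities research on Jixia studies. [Method/Process] Guan Zi Journal (《管子学刊》) was selected as the data source. Part of the texts it contains were summarized into a scheme of 11 major categories with 42 subcategories of thought, and a training data set was built from them. A JixiaERNIE model based on fine-tuning ERNIE is proposed; the automatic identification of Jixia thought is mapped to an automatic text classification problem, and the model is used to perform the classification. [Result/Conclusion] Comparative experiments show that the JixiaERNIE model achieves its best classification performance with a learning rate of 4e-5 and 10 iterations, improving the F value by 7.9% over the baseline model. To further improve the model's classification performance, classifiers added on top of the model's connection layer are compared, effectively accomplishing the automatic classification of Jixia thought for digital humanities research.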

Keywords: digital humanities, automatic classification, Guan Zi Journal (管子学刊), Jixia thought, JixiaERNIE

Abstract: [Purpose/Significance] Jixia thought is an overlooked treasure of the pre-Qin period, when the Hundred Schools of Thought contended. This paper studies how to automatically identify Jixia thought in the Jixia research literature, and provides a methodological basis for digital humanities research on Jixia studies. [Method/Process] This paper selected Guan Zi Journal as the data source, organized part of the texts it contains into 11 major categories and 42 subcategories of thought, and built a training data set. It proposes a JixiaERNIE model based on fine-tuning ERNIE, casts the automatic identification of Jixia thought as an automatic text classification problem, and uses the proposed model to perform the classification. [Result/Conclusion] Experimental comparison shows that the JixiaERNIE model achieves its best classification performance with a learning rate of 4e-5 and 10 iterations, increasing the F value by 7.9% compared with the baseline model. To further improve classification performance, different classifiers added on top of the model's connection layer are compared, which effectively accomplishes the automatic classification of Jixia thought for digital humanities research.

Keywords: digital humanities, automatic classification, Guan Zi Journal, Jixia thought, JixiaERNIE
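
The abstract reports its results as an F value at a particular fine-tuning setting. As a rough illustration only, the sketch below records the reported setting in a small configuration object and computes a macro-averaged F1 score in plain Python. The abstract does not say which averaging scheme the authors used, so macro averaging is an assumption here, and the names `FineTuneConfig` and `macro_f1` and the category labels in the usage note are hypothetical, not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    # Best-performing setting reported in the abstract.
    learning_rate: float = 4e-5
    num_iterations: int = 10
    # Label-scheme sizes reported in the abstract.
    num_major_classes: int = 11
    num_minor_classes: int = 42


def macro_f1(y_true, y_pred):
    """Macro-averaged F1 over every label seen in y_true or y_pred.

    Assumption: macro averaging. The abstract only says "F value".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted label was wrong
            fn[gold] += 1  # gold label was missed
    labels = set(y_true) | set(y_pred)
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        # Per-class F1 = 2TP / (2TP + FP + FN).
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

For example, with the hypothetical gold labels `["politics", "economy", "politics", "military"]` and predictions `["politics", "politics", "politics", "military"]`, the per-class F1 scores are 0.8, 0.0 and 1.0, so `macro_f1` returns their mean. A macro average weights each category equally, which matters when some of the 42 subcategories have few examples.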

CLC number: