Library and Information Service ›› 2022, Vol. 66 ›› Issue (5): 133-141. DOI: 10.13266/j.issn.0252-3116.2022.05.014

• Knowledge Organization •

Research on the Recognition of Event Trigger Verbs in Ancient Classics: Textual Experiments Based on Zuo Zhuan

He Lin1,2, Ma Xiaowen1,3, Yu Xuehan1,2, Ai Yuxi1,2, Li Zhangchao1,2, Gao Dan1,2

  1. School of Information Management, Nanjing Agricultural University, Nanjing 210095;
  2. Research Center for Humanities and Social Computing, Nanjing Agricultural University, Nanjing 210095;
  3. Nanjing Medical University Library, Nanjing 210029
  • Received: 2021-08-01 Revised: 2021-11-21 Online: 2022-03-05 Published: 2022-03-21
  • About the authors: He Lin, professor and doctoral supervisor, E-mail: helin@njau.edu.cn; Ma Xiaowen, master's student; Yu Xuehan, doctoral student; Ai Yuxi, master's student; Li Zhangchao, doctoral student; Gao Dan, doctoral student.
  • Supported by:
    This work is one of the research outcomes of the National Social Science Fund of China project "Automatic Construction Methods for a Knowledge Representation System of Traditional Chinese Culture Based on Ancient Classics" (Grant No. 18BTQ063).

Abstract: [Purpose/significance] Automatic event recognition and extraction is an important new topic in research on topic mining of ancient classics. Recognizing event trigger words is a foundational task that determines the quality of event extraction. This study aims to explore a general method for automatically recognizing and classifying event trigger words in ancient classics. [Method/process] First, an LDA model is used to cluster verbs by topic, and the clusters, combined with qualitative analysis, are used to derive a classification scheme for event trigger verbs. Based on the clustering results and the classification scheme, a preliminary seed set of trigger verbs is built. The seed set is then expanded through semantic similarity calculation to construct a semantic dataset of event trigger words for ancient classics. In the experiments, Zuo Zhuan, an important classic of the pre-Qin period, serves as the data source for validating both the construction of the classification scheme and the expansion of the seed set. [Result/conclusion] The results show that the proposed method for recognizing event trigger words is feasible and effective, and that the resulting event trigger word set has a high degree of credibility. The sample size and scope of the experiment can be further expanded in future work.

Key words: trigger word recognition, topic clustering, word set expansion, classification system construction, ancient classic text
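
The seed-set expansion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which similarity measure, threshold, or word representation was used, so cosine similarity over word vectors, the 0.8 threshold, the function names, the toy vectors, and the category labels "military" and "alliance" are all assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_seed_set(seeds, candidates, vectors, threshold=0.8):
    """Assign each candidate verb to the category of its most similar seed.

    seeds: dict mapping a category label to a set of seed verbs.
    candidates: iterable of verbs not yet categorised.
    vectors: dict mapping a verb to its vector (hypothetical embeddings).
    A candidate is added only if its best similarity reaches `threshold`.
    """
    expanded = {category: set(words) for category, words in seeds.items()}
    known_seeds = set().union(*seeds.values()) if seeds else set()
    for verb in candidates:
        if verb in known_seeds or verb not in vectors:
            continue
        best_category, best_score = None, threshold
        for category, words in seeds.items():
            for seed in words:
                if seed not in vectors:
                    continue
                score = cosine(vectors[verb], vectors[seed])
                if score >= best_score:
                    best_category, best_score = category, score
        if best_category is not None:
            expanded[best_category].add(verb)
    return expanded


# Toy data, invented for this sketch; these are not results from the paper.
TOY_VECTORS = {
    "伐": [0.9, 0.1, 0.0],
    "侵": [0.8, 0.2, 0.1],
    "盟": [0.1, 0.9, 0.0],
    "会": [0.2, 0.8, 0.1],
    "曰": [0.0, 0.1, 0.9],
}
TOY_SEEDS = {"military": {"伐"}, "alliance": {"盟"}}

expanded = expand_seed_set(TOY_SEEDS, ["侵", "会", "曰"], TOY_VECTORS)
```

With these toy vectors, "侵" joins the "military" set and "会" joins the "alliance" set, while "曰" stays unassigned because its similarity to every seed falls below the threshold. In practice the threshold would need tuning against manually checked trigger verbs.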

CLC number: