Library and Information Service (图书情报工作), 2022, Vol. 66, Issue 7: 120-131. DOI: 10.13266/j.issn.0252-3116.2022.07.012

• Knowledge Organization •

自有知识增强下的学术全文本关系抽取研究

卓可秋1, 沈思2, 王东波1   

  1. 南京农业大学信息管理学院 南京 210095;
    2. 南京理工大学经济管理学院 南京 210094
  • 收稿日期:2021-11-24 修回日期:2022-01-19 出版日期:2022-04-05 发布日期:2022-04-15
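  • Corresponding author: Wang Dongbo, professor, doctoral supervisor, E-mail: db.wang@njau.edu.cn.
  • About the authors: Zhuo Keqiu, doctoral candidate; Shen Si, associate professor, doctoral supervisor.
  • Funding:
    This work is one of the research outcomes of the Natural Science Foundation of Jiangsu Province Youth Project "Research on Temporal Semantic Knowledge Identification and Retrieval Model Construction for Academic Full Text Based on Deep Learning" (Grant No. BK20190450) and the National Natural Science Foundation of China General Program project "Research on Knowledge Graph Construction and Retrieval for Academic Full Text Based on Deep Learning" (Grant No. 71974094).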

Research on Relation Extraction of Academic Full-Text Based on Self-Owned Knowledge Enhancement

Zhuo Keqiu1, Shen Si2, Wang Dongbo1   

  1. School of Information Management, Nanjing Agricultural University, Nanjing 210095;
    2. School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094
  • Received:2021-11-24 Revised:2022-01-19 Online:2022-04-05 Published:2022-04-15

摘要: [目的/意义] 学术全文本下的关系抽取是学术全文本知识图谱构建的关键技术,所构建的学术知识图谱能够实现文献的结构化、知识化,提高研究人员检索文献、分析文献和把握科研动态的效率,以及通过图谱的认知推理,有助于隐式知识发现。[方法/过程] 通过外部知识来增强关系抽取已在不少研究取得成果,但针对特定领域的关系抽取往往缺少可用的外部知识。研究发现,全文本中自有的高置信度的知识也可以用来辅助全文本关系抽取。受认知过程双系统理论(系统1为直觉认知,系统2为推理认知)启发,设计一个句子级模型来获取知识,并通过远程监督方式获取高置信度知识,然后将高置信度知识融入到全文本级深度学习模型最后分类的一层上。[结果/结论] 在生物医学学术全文本数据集(CDR-revised)上,比当前最先进的模型在F1上提高11.13%。

关键词: 学术全文本, 关系抽取, 自有知识增强, 知识图谱

Abstract: [Purpose/Significance] Relation extraction from academic full text is a key technology for constructing academic full-text knowledge graphs. Such knowledge graphs structure the literature and turn it into knowledge, making researchers more efficient at retrieving and analyzing documents and at tracking research trends, and cognitive reasoning over the graph helps uncover implicit knowledge. [Method/Process] Enhancing relation extraction with external knowledge has produced results in many studies, but relation extraction in specific domains often lacks usable external knowledge. This study found that high-confidence knowledge contained in the full text itself can also be used to assist full-text relation extraction. Inspired by the dual-system theory of cognitive processes (System 1 is intuitive cognition, System 2 is reasoning cognition), this paper designed a sentence-level model to acquire knowledge, obtained high-confidence knowledge through distant supervision, and then integrated that knowledge into the final classification layer of the full-text-level deep learning model. [Result/Conclusion] On the biomedical academic full-text dataset CDR-revised, the F1 score is about 11.13% higher than that of the current state-of-the-art model.

Key words: academic full-text, relation extraction, self-owned knowledge enhancement, knowledge graph

CLC Number: