Library and Information Service, 2020, Vol. 64, Issue 11: 108-115. DOI: 10.13266/j.issn.0252-3116.2020.11.012

• Knowledge Organization •

A Research Entity Recognition Algorithm Based on Dependency Parsing

Zhao Huaming¹, Qian Li¹,², Yu Li¹

  ¹ National Science Library, Chinese Academy of Sciences, Beijing 100190;
  ² Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190
  • Received: 2019-09-24 Revised: 2020-02-02 Online: 2020-06-05 Published: 2020-06-05
  • About the authors: Zhao Huaming (ORCID: 0000-0002-8829-9208), associate research librarian, E-mail: zhaohm@mail.las.ac.cn; Qian Li (ORCID: 0000-0002-0931-2882), associate research librarian; Yu Li (ORCID: 0000-0002-4374-8743), librarian.
  • Supported by:
    This paper is one of the research outcomes of the Chinese Academy of Sciences special project on literature and information capacity building, "Construction of a Literature and Information 'Data Lake' and an Open Big Data Framework" (Grant No. 院1852), and of the National Science and Technology Library special task "Research on Multi-Source Data Value Enhancement and Knowledge Computing Methods" (Grant No. K180201001).

Abstract: [Purpose/significance] This paper explores the recognition and extraction of research named entities and the relations between them, aiming to improve recognition in complex cases such as long sentences and to provide a reference for further applications. [Method/process] Building on an analysis of dependency syntactic features, it proposes a method for extracting relations between research named entities. The process has three steps: (1) part-of-speech (POS) tagging of the target text with the Stanford Tagger; (2) based on the tagging results, splitting the target text into structurally regular semantic segments around the core predicate and the SAO (subject-action-object) structure; (3) using dependency parsing to find the subject and object semantically related to the core predicate, forming (entity, relation, entity) triples. [Result/conclusion] Comparative tests against mainstream algorithms such as Ollie and Reverb show that the method can effectively improve the accuracy of research named entity recognition.

Key words: dependency parsing, research named entity, entity recognition, relation extraction
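
The third step of the method described in the abstract, finding the subject and object attached to the core predicate, can be illustrated with a minimal sketch. This is not the authors' implementation. The `Token` type, the helper names, and the hand-annotated parse are hypothetical; the POS tags and dependency labels (`nsubj`, `obj`, `compound`, `amod`) are assumed to follow Penn Treebank and Universal Dependencies conventions. Tagging and parsing are taken as already done, and the segmentation of long sentences (step 2) is omitted.

```python
from typing import List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    """One token of an already tagged and parsed sentence (hypothetical type)."""
    index: int   # 1-based position in the sentence
    text: str
    pos: str     # POS tag, assumed Penn Treebank style
    head: int    # index of the syntactic head, 0 for the root
    deprel: str  # dependency label linking the token to its head


def noun_phrase(tokens: List[Token], head_index: int) -> str:
    """Join a head noun with its compound/adjectival modifiers, in sentence order."""
    keep = {"compound", "amod"}
    members = [
        t for t in tokens
        if t.index == head_index or (t.head == head_index and t.deprel in keep)
    ]
    return " ".join(t.text for t in sorted(members, key=lambda t: t.index))


def extract_triple(tokens: List[Token]) -> Optional[Tuple[str, str, str]]:
    """Return (entity, relation, entity) around the root verb, or None if incomplete."""
    # The core predicate is taken to be the verbal root of the sentence.
    root = next((t for t in tokens if t.head == 0 and t.pos.startswith("VB")), None)
    if root is None:
        return None
    subj = next((t for t in tokens
                 if t.head == root.index and t.deprel == "nsubj"), None)
    obj = next((t for t in tokens
                if t.head == root.index and t.deprel in ("obj", "dobj")), None)
    if subj is None or obj is None:
        return None
    return (noun_phrase(tokens, subj.index), root.text, noun_phrase(tokens, obj.index))


# Hand-annotated parse (an assumption, not parser output) of:
# "The proposed method improves recognition accuracy"
SENTENCE = [
    Token(1, "The", "DT", 3, "det"),
    Token(2, "proposed", "VBN", 3, "amod"),
    Token(3, "method", "NN", 4, "nsubj"),
    Token(4, "improves", "VBZ", 0, "root"),
    Token(5, "recognition", "NN", 6, "compound"),
    Token(6, "accuracy", "NN", 4, "obj"),
]

triple = extract_triple(SENTENCE)
print(triple)
```

In a real pipeline the `Token` list would come from a POS tagger and dependency parser rather than being written by hand, and each semantic segment produced by step 2 would be passed to `extract_triple` separately.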

CLC number: