Library and Information Service (图书情报工作) ›› 2021, Vol. 65 ›› Issue (12): 93-100. DOI: 10.13266/j.issn.0252-3116.2021.12.009

• Research Article •

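  • About the authors: Luo Zhuoran (ORCID: 0000-0003-0677-8350), PhD candidate, E-mail: zoraluo@whu.edu.cn; Cai Le (ORCID: 0000-0003-1278-4343), master's student; Qian Jiajia (ORCID: 0000-0002-6058-1287), master's student; Lu Wei (ORCID: 0000-0002-0929-7416), professor and doctoral supervisor.
  • Funding:
    This work is one of the research outcomes of the Major Program of the National Social Science Fund of China, "Research on the Theory and Methods of Academic Paper Evaluation Based on Cognitive Computing" (Grant No. 17ZDA292).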

Research on the Recognition of Innovative Contribution Sentences of Academic Papers

Luo Zhuoran, Cai Le, Qian Jiajia, Lu Wei   

  1. School of Information Management, Wuhan University, Wuhan 430072
  • Received: 2020-12-10 Revised: 2021-03-25 Online: 2021-06-20 Published: 2021-07-03


Abstract: [Purpose/significance] Contribution sentences are an important means by which an academic paper conveys its novelty and academic value. Using the full text of academic papers and MeSH terms as data, this study applies natural language processing and deep learning techniques to recognize contribution sentences in academic papers. The work lays a foundation for fine-grained mining of the innovative contributions in academic texts and has theoretical and practical significance for evaluating academic papers based on cognitive computing. [Method/process] First, with full-text PubMed papers as the data source, the MeSH terms of each paper were extracted, and element analysis and feature extraction were performed on the contribution sentences. Second, the data were annotated with a semi-automatic approach. Finally, contribution sentences were recognized automatically with a model based on the ALBERT deep learning model. [Result/conclusion] A data consistency test confirms the reliability of the annotated training data, and the experimental results show that the trained recognition model identifies contribution sentences in academic papers more effectively than the other deep learning models compared.

Keywords: contribution sentences, academic papers, novelty, ALBERT
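
The abstract mentions a data consistency test on the annotated training data but does not say which agreement statistic was used. As an illustration only, the sketch below assumes two annotators and Cohen's kappa, one common choice for such a check. The function name `cohen_kappa` and the label lists are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Assumption: this is one possible consistency check, not necessarily
    the statistic the authors used.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label
    # frequencies, summed over all labels either annotator used.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: 1 = contribution sentence, 0 = other sentence.
annotator_1 = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
annotator_2 = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

Values near 1 indicate agreement well above chance, while values near 0 indicate agreement no better than chance. Any threshold for accepting the annotations would be a separate design choice that the abstract does not specify.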
