Automatic Theory Recognition in Academic Journals Based on CRF

  • Chen Feng,
  • Zhai Yujia,
  • Wang Fang
  • Department of Information Resources Management, Business School of Nankai University, Tianjin 300071

Received date: 2015-10-25

Revised date: 2015-12-14

Online published: 2016-01-20


[Purpose/significance] Theory recognition in academic journals is a precondition for content analysis, so automating it can improve the efficiency of content analysis. [Method/process] This paper treats theory recognition as a named entity recognition task, reviews existing named entity recognition methods, and proposes a theory recognition model based on semantic generalization. Using part-of-speech tags, HowNet semantics, and other external knowledge as features, it conducts a series of experiments with a CRF model on 1822 academic journal papers. [Result/conclusion] The recognition accuracy rate is high, at 95.38%, but the recall rate is low; the size of the training text has a large influence on performance. Semantic resources can improve performance, but they reduce the recall rate. How to select semantic features, and how to handle semantic annotation and semantic disambiguation, remain to be solved.
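As a rough illustration of the approach summarized above, the following Python sketch shows how per-token features combining the word, its part of speech, and a generalized semantic class might be assembled for a CRF-style sequence labeler, and how BIO tags could be turned back into theory names. It is not the authors' implementation: the `SEMANTIC_CLASS` dictionary is a hypothetical stand-in for a HowNet lookup, the `B-THEORY`/`I-THEORY` tag names and the English example sentence are assumptions, and the CRF training step itself is omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a semantic lexicon such as HowNet:
# maps a word to a coarse semantic class (the "semantic generalization").
SEMANTIC_CLASS: Dict[str, str] = {
    "theory": "knowledge",
    "model": "knowledge",
    "diffusion": "event",
    "innovation": "event",
}


def semantic_class(word: str) -> str:
    """Return the coarse semantic class of a word, or "UNK" if unlisted."""
    return SEMANTIC_CLASS.get(word.lower(), "UNK")


def token_features(sent: List[Tuple[str, str]], i: int) -> Dict[str, str]:
    """Build features for token i of a sentence of (word, POS) pairs."""
    word, pos = sent[i]
    feats = {"word": word.lower(), "pos": pos, "sem": semantic_class(word)}
    if i > 0:
        prev_word, prev_pos = sent[i - 1]
        feats["-1:pos"] = prev_pos
        feats["-1:sem"] = semantic_class(prev_word)
    else:
        feats["BOS"] = "1"  # beginning of sentence
    if i < len(sent) - 1:
        next_word, next_pos = sent[i + 1]
        feats["+1:pos"] = next_pos
        feats["+1:sem"] = semantic_class(next_word)
    else:
        feats["EOS"] = "1"  # end of sentence
    return feats


def sentence_features(sent: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Feature dicts for every token, in the shape CRF toolkits usually accept."""
    return [token_features(sent, i) for i in range(len(sent))]


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect token spans labeled B-THEORY / I-THEORY into theory names."""
    spans: List[str] = []
    current: List[str] = []
    for tok, tag in zip(tokens, tags):
        if tag == "B-THEORY":
            if current:
                spans.append(" ".join(current))
            current = [tok]
        elif tag == "I-THEORY" and current:
            current.append(tok)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans
```

In a setup like this, the feature dictionaries would be passed to a CRF trainer together with gold BIO tags, and `bio_to_spans` would be applied to the predicted tags. Replacing rare words with a shared semantic class lets the model generalize across theory names it has not seen, though the abstract reports that this comes at some cost in recall.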

Chen Feng, Zhai Yujia, Wang Fang. Automatic Theory Recognition in Academic Journals Based on CRF[J]. Library and Information Service, 2016, 60(2): 122-128. DOI: 10.13266/j.issn.0252-3116.2016.02.019


