

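  • Cao Yang (曹洋),
  • Cheng Ying (成颖),
  • Pei Lei (裴雷)
  • School of Information Management, Nanjing University (南京大学信息管理学院)

Received date: 2014-07-24

  Revised date: 2014-08-22

  Online published: 2014-09-20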



A Review on Machine Learning Oriented Automatic Summarization

  • Cao Yang,
  • Cheng Ying,
  • Pei Lei
  • School of Information Management, Nanjing University, Nanjing 210093

Received date: 2014-07-24

  Revised date: 2014-08-22

  Online published: 2014-09-20



Keywords: automatic summarization; machine learning; NB; HMM; CRF


Cao Yang, Cheng Ying, Pei Lei. A Review on Machine Learning Oriented Automatic Summarization[J]. 图书情报工作 (Library and Information Service), 2014, 58(18): 122-130. DOI: 10.13266/j.issn.0252-3116.2014.18.018


This paper examines the process of automatic summarization based on machine learning, which comprises feature selection, algorithm selection, model training, summary extraction, and model evaluation. The review focuses on three main machine learning algorithms: Naive Bayes (NB), the Hidden Markov Model (HMM), and Conditional Random Fields (CRF). For each algorithm, it explains the underlying idea, summarizes related research, and offers reflections. It then discusses problems common to the three algorithms, including training methods, collaborative training (co-training) and active learning, class balance, and term distribution. Finally, it explores directions for future research.


