A Literature Review on Pre-processing and Learning of Microtext
Received date: 2013-04-07
Revised date: 2013-05-17
Online published: 2013-06-05
Wang Lianxi. A Literature Review on Pre-processing and Learning of Microtext[J]. Library and Information Service, 2013, 57(11): 125-131. DOI: 10.7536/j.issn.0252-3116.2013.11.023
Because the features of microtext are sparse and highly redundant, pre-processing and learning methods are key problems in data mining for microblogs and have wide application. This paper analyzes the characteristics of microtext and summarizes pre-processing and learning methods and their applications, including short text representation models, short text feature expansion and selection, short text classification and clustering, hot event detection, and automatic summarization. Finally, the paper discusses the limitations of recent studies and points out directions for future research.
