Automatic Identification of News Intent Based on Analyzing Query Features

  • Zhang Xiaojuan,
  • Lu Wei,
  • Lei Shengwei

  1. School of Computer and Information Science, Southwest University, Chongqing 400715;
  2. Centre for Studies of Information Resources, Wuhan University, Wuhan 430072

Received date: 2014-07-10

Revised date: 2014-09-05

Online published: 2014-10-30


This paper selects sample queries from the Sogou query log and has them labeled by human annotators. Based on an analysis of the labeled news queries, we propose three novel features for news intent prediction: query expression, query distribution over time, and clicked results. We then apply the decision tree method to identify news queries automatically. The experimental results show that: (1) the goal of a news query is typically to obtain information on a particular topic or some entertainment information, and the search topics of news queries tend to be entertainment, economy, politics and sports; (2) compared with non-news queries, news queries are more likely to contain named entities, to show larger fluctuation in their query distribution over time, and to have a higher degree of similarity among clicked results; (3) encouraging results are achieved for news query identification, with precision, recall and F-score of 0.76, 0.73 and 0.74, respectively.
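As a rough illustration of two quantities mentioned in the abstract, the sketch below computes an F-score from precision and recall and a simple measure of how much a query's frequency fluctuates over time. It is not the authors' implementation. The abstract does not say how the F-score or the temporal fluctuation is defined, so the balanced F1 formula and the coefficient of variation used here are assumptions, and the function names `f_score` and `temporal_fluctuation` and the daily counts are hypothetical.

```python
from statistics import mean, pstdev


def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Assumption: the abstract does not state which F-score variant is used.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def temporal_fluctuation(daily_counts):
    """Coefficient of variation of a query's daily submission counts.

    Assumption: a stand-in for the "fluctuation in the query distribution
    over time"; the paper's actual measure is not given in the abstract.
    """
    avg = mean(daily_counts)
    if avg == 0:
        return 0.0
    return pstdev(daily_counts) / avg


# Precision and recall as reported in the abstract.
reported_f = round(f_score(0.76, 0.73), 2)

# Hypothetical daily counts: a bursty query versus a steady one.
bursty = temporal_fluctuation([1, 1, 1, 50])
steady = temporal_fluctuation([10, 12, 11, 13])
```

Under the F1 assumption, the reported precision and recall round to the reported F-score of 0.74, and the bursty series scores higher than the steady one, which matches the direction of finding (2).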

Zhang Xiaojuan, Lu Wei, Lei Shengwei. Automatic Identification of News Intent Based on Analyzing Query Features[J]. Library and Information Service, 2014, 58(20): 82-90. DOI: 10.13266/j.issn.0252-3116.2014.20.013


