Library and Information Service
Research on Chinese Informational Webpage Classification Based on Query Intention
Received date: 2014-11-28
Revised date: 2014-12-20
Online published: 2015-01-05
[Purpose/significance] Webpage classification can improve the retrieval performance of search engines and content sites, and classifying pages according to query intention meets users' needs more precisely. [Method/process] This paper takes Chinese informational pages as the research object. It first constructs a classification system for informational query intentions by manual induction, then proposes a method for grouping pages based on that classification system and verifies it experimentally. [Result/conclusion] The experimental results show that the method is viable and offers some reference value for improving retrieval relevance and meeting users' real query intentions.
Wang Xiaoyan, Lin Changyi. Research on Chinese Informational Webpage Classification Based on Query Intention[J]. Library and Information Service, 2015, 59(1): 113-118,126. DOI: 10.13266/j.issn.0252-3116.2015.01.015