Study on Spammer Detection Based on Reviewer-Specific Characteristics

  • Nie Hui,
  • Wu Yijun
  • School of Information Management, Sun Yat-Sen University, Guangzhou 510275

Received date: 2015-03-13

Revised date: 2015-05-02

  Online published: 2015-05-20

Abstract

[Purpose/significance] This paper studies review spammer detection based on reviewer-specific characteristics. It aims to reveal the characteristics and behavioral regularities of the "water army" (paid posters) on the web, and to build a simple, interpretable prediction model for identifying review spammers, laying a foundation for deeper review-mining research. [Method/process] Combining empirical analysis with machine learning (ML) techniques, the study explores the internal review-evaluation strategy of the target website. Factor analysis is used to extract feature and behavior factors of reviewers, and a logistic regression model built on these factors identifies review spammers. [Result/conclusion] On the dataset collected from the target website, the classification accuracy for spammer identification reaches 73.8% and the AUC is 80.9%. Three reviewer-related factors (contribution, activity, and word-specific literacy) are significantly associated with spammer identification, whereas reviewer level, emotion, and rating deviation are not. The ML-based methods and the empirical analysis yield largely consistent results, and the prediction model can be given a reasonable explanation.
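The modeling step described in the abstract (logistic regression on reviewer factor scores, evaluated by accuracy and AUC) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the factor-analysis step is skipped (the three columns stand in for already-extracted factor scores such as contribution, activity, and word-specific literacy), and the direction of the group difference, the learning rate, and all function names are assumptions made for illustration only.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(X, w, b):
    # Predicted probability that each reviewer is a spammer.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def accuracy(y_true, probs, threshold=0.5):
    hits = sum((p >= threshold) == bool(t) for t, p in zip(y_true, probs))
    return hits / len(y_true)

def auc(y_true, probs):
    """AUC as the chance that a random spammer outscores a random non-spammer."""
    pos = [p for t, p in zip(y_true, probs) if t == 1]
    neg = [p for t, p in zip(y_true, probs) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def make_synthetic(n, rng):
    """Hypothetical data: three factor scores per reviewer, label 1 = spammer."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.5 else 0
        # Assumption for illustration: spammers score lower on every factor.
        shift = -0.8 if label else 0.8
        X.append([rng.gauss(shift, 1.0) for _ in range(3)])
        y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_synthetic(400, rng)
X_test, y_test = make_synthetic(200, rng)

w, b = fit_logistic(X_train, y_train)
probs = predict_proba(X_test, w, b)
acc_value = accuracy(y_test, probs)
auc_value = auc(y_test, probs)
print(f"accuracy={acc_value:.3f}  AUC={auc_value:.3f}")
```

Because the model is linear in the factor scores, the sign and size of each fitted weight indicate how that factor shifts the predicted odds of a reviewer being a spammer, which is the kind of interpretability the abstract emphasizes. The figures printed by this sketch reflect the synthetic data only and are unrelated to the 73.8% accuracy and 80.9% AUC reported above.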

Cite this article

Nie Hui, Wu Yijun. Study on Spammer Detection Based on Reviewer-Specific Characteristics[J]. Library and Information Service, 2015, 59(10): 102-109. DOI: 10.13266/j.issn.0252-3116.2015.10.015

