Intelligence Research

A Fine-grained Sentiment Analysis Model for Product Reviews Based on Word2Vec and CNN

  • Cai Qingping,
  • Ma Haiqun
  • 1. School of Information Management, Heilongjiang University, Harbin 150080;
    2. Research Center of Information Resource Management, Heilongjiang University, Harbin 150080
Cai Qingping (ORCID: 0000-0002-1992-4081), lecturer, doctoral candidate, E-mail: cqp-cqp@163.com; Ma Haiqun (ORCID: 0000-0002-2091-7620), professor, PhD, doctoral supervisor.

Received date: 2019-05-19

  Revised date: 2019-12-02

  Online published: 2020-03-20

Funding

This paper is one of the research outcomes of the National Social Science Fund of China key project "Research on Policy Coordination between Open Data and Data Security" (Grant No. 15ATQ008) and the Basic Research Project of the Fundamental Research Funds for Provincial Universities of Heilongjiang Province "Research on Sentiment Analysis of Product Review Information Based on Deep Learning" (Grant No. RWSKCX201809).

Cite this article

Cai Qingping, Ma Haiqun. A Fine-grained Sentiment Analysis Model for Product Reviews Based on Word2Vec and CNN[J]. Library and Information Service (图书情报工作), 2020, 64(6): 49-58. DOI: 10.13266/j.issn.0252-3116.2020.06.007

Abstract

[Purpose/significance] This paper constructs a fine-grained sentiment analysis model for product reviews based on Word2Vec and CNN. [Method/process] First, Word2Vec is used to build a product feature word list and a noise word list from product reviews. Second, feature words are extracted from the reviews with the help of the noise word list. Third, a CNN classifies the sentiment of the reviews at the product-feature level. Finally, the reviews are clustered by product feature. [Result/conclusion] The model was trained and tested on reviews of Huawei mobile phones crawled from JingDong Mall. The results show that the model can effectively perform fine-grained sentiment analysis of product reviews and reveal both how much attention users pay to each product feature and how satisfied they are with it.
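
The extraction and clustering steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the two vocabularies are hard-coded rather than built with Word2Vec, reviews are English and split on whitespace (Chinese reviews would need a word segmenter), `stub_classifier` is a toy keyword rule standing in for the CNN, and "attention" and "satisfaction" are taken to mean share of feature mentions and share of positive mentions respectively.

```python
import re
from collections import defaultdict

# Hypothetical vocabularies. The paper derives both lists with Word2Vec;
# they are hard-coded here purely for illustration.
FEATURE_VOCAB = {
    "screen": "screen", "display": "screen",
    "battery": "battery", "charging": "battery",
    "camera": "camera", "photos": "camera",
    "phone": "screen",  # spurious entry, kept to show the noise filter
}
NOISE_VOCAB = {"phone", "huawei", "delivery", "seller"}


def extract_features(tokens):
    """Return the canonical product features mentioned in a token list."""
    features = []
    for token in tokens:
        if token in NOISE_VOCAB:
            continue  # the noise list takes precedence over the feature list
        feature = FEATURE_VOCAB.get(token)
        if feature is not None and feature not in features:
            features.append(feature)
    return features


def stub_classifier(tokens):
    """Stand-in for the CNN: 1 = positive, 0 = negative (toy keyword rule)."""
    negative_cues = {"bad", "poor", "slow", "blurry"}
    return 0 if negative_cues & set(tokens) else 1


def summarize(reviews, classify=stub_classifier):
    """Group clause-level sentiment labels by feature, then aggregate."""
    clusters = defaultdict(list)
    for text in reviews:
        # Classify clause by clause so each feature gets its own label.
        for clause in re.split(r"[,.;!?]", text.lower()):
            tokens = clause.split()
            features = extract_features(tokens)
            if not features:
                continue
            label = classify(tokens)
            for feature in features:
                clusters[feature].append(label)
    total = sum(len(labels) for labels in clusters.values())
    return {
        feature: {
            "attention": len(labels) / total,            # share of mentions
            "satisfaction": sum(labels) / len(labels),   # share positive
        }
        for feature, labels in clusters.items()
    }


reviews = [
    "The screen is bright, but the battery is poor.",
    "Great camera, fast delivery.",
    "Battery lasts long and the display is sharp.",
]
summary = summarize(reviews)
for feature, stats in summary.items():
    print(feature, stats)
```

Passing a trained model's prediction function as `classify` would replace the keyword rule without changing the aggregation logic.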
