Received date: 2016-01-12
Revised date: 2016-02-06
Online published: 2016-02-20
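Funding
This work is one of the research outcomes of the National Social Science Fund of China project "Research on Collaborative Emergency Response Mechanisms for Public Emergencies on Social Media" (Grant No. 14CXW045) and the Ministry of Education Humanities and Social Sciences Research project "Real-Time Analysis and Trend Prediction of the Propagation Paths of Public Emergencies on Microblogs" (Grant No. 13YJC860013).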
A Deep Neural Network for Book Title Identification in Microblog
Zhu Nana, Jing Dong, Xue Han. A Deep Neural Network for Book Title Identification in Microblog[J]. Library and Information Service (图书情报工作), 2016, 60(4): 102-106,141. DOI: 10.13266/j.issn.0252-3116.2016.04.014
[Purpose/significance] As a fast-growing social media platform, microblogging has attracted extensive attention from web users. Microblog data include large volumes of user profiles, user behavior records, and user-generated content. Automatically identifying book titles in microblogs supports the analysis of user interests and data mining on books. [Method/process] Based on the characteristics of microblog data, this paper proposes a deep neural network approach to identify book titles in microblogs posted by users. [Result/conclusion] The experimental results show that the proposed approach significantly outperforms traditional supervised learning approaches based on feature engineering, reaching an accuracy of 91.92%.
Key words: book title identification; neural network; deep learning; microblog