Information Research

Research on the Ranking of Social Q&A Community Answers Based on Multidimensional Features

  • Yi Ming,
  • Zhang Tingting,
  • Li Ziqi
  • School of Information Management, Central China Normal University, Wuhan 430079
Yi Ming (ORCID: 0000-0002-4864-6025), professor, doctoral supervisor; Zhang Tingting (ORCID: 0000-0002-5068-8232), master's student.

Received: 2020-01-13

Revised: 2020-04-09

Published online: 2020-09-05

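Funding

This paper is one of the research outcomes of the National Social Science Fund of China project "Research on Information Exchange Behavior in Information Networks Based on Human Dynamics" (Grant No. 16BTQ076) and the Fundamental Research Funds for the Central Universities major cultivation project "Research on Key Technologies and Applications of Smart Library Systems" (Grant No. CCNU18JCXK04).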

Research on the Ranking of Social Q&A Community Answers Based on Multidimensional Features

  • Yi Ming,
  • Zhang Tingting,
  • Li Ziqi
  • School of Information Management, Central China Normal University, Wuhan 430079

Received: 2020-01-13

Revised: 2020-04-09

Published online: 2020-09-05

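Abstract

[Purpose/significance] To study how multidimensional features affect answer ranking in social Q&A communities, so as to improve the service quality of Q&A communities and optimize the user experience as far as possible. [Method/process] An answer ranking feature system for social Q&A communities is constructed along several dimensions: answer features, answerer features, and voter features. The applicability of 11 learning-to-rank algorithms, based on deep learning, trees, neural networks, support vector machines, and other approaches, is compared on a Q&A community dataset, and a random forest classifier is trained to obtain the importance of each feature. [Result/conclusion] The experiments show that the deep-learning-based learning-to-rank algorithm outperforms the other ranking algorithms on both NDCG@k and MRR. Voter influence features are the most important, followed by answer content features, and finally answerer expertise features. Answer ranking could be further optimized along two dimensions: increasing the diversity of answer ranking methods and improving the comprehensiveness of the answer ranking algorithm.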

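Cite this article

Yi Ming, Zhang Tingting, Li Ziqi. Research on the Ranking of Social Q&A Community Answers Based on Multidimensional Features[J]. Library and Information Service, 2020, 64(17): 103-113. DOI: 10.13266/j.issn.0252-3116.2020.17.011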

Abstract

[Purpose/significance] This paper studies the impact of multidimensional features on answer ranking in social Q&A communities, with the aim of improving service quality and optimizing the user experience. [Method/process] We construct an answer ranking feature system for social Q&A communities along three dimensions: answer features, answerer features, and voter features. We then compare the applicability of 11 learning-to-rank algorithms, based on deep learning, trees, neural networks, support vector machines, and other approaches, on a social Q&A community dataset, and train a random forest classifier to obtain the importance of each feature. [Result/conclusion] The experimental results show that the deep-learning-based learning-to-rank algorithm outperforms the other ranking algorithms on both NDCG@k and MRR. Voter influence features are the most important, followed by answer content features, and finally answerer expertise features. We suggest further optimizing community answer ranking along two dimensions: increasing the diversity of answer ranking methods and improving the comprehensiveness of the answer ranking algorithm.
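The two evaluation metrics named in the abstract, NDCG@k and MRR, can be illustrated with a minimal sketch. The abstract does not state which NDCG gain formulation or relevance labels the authors used, so the exponential gain (2^rel - 1), the log2 discount, and the function names below are assumptions made for illustration only, not the paper's implementation.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k items.

    Assumes the exponential gain 2^rel - 1 and a log2 position discount.
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based, so position 1 -> log2(2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """NDCG@k for one question: DCG of the predicted order over DCG of the ideal order."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant answers at all; define the score as 0
    return dcg_at_k(relevances, k) / ideal_dcg


def mean_reciprocal_rank(ranked_flags: Sequence[Sequence[int]]) -> float:
    """MRR over several questions.

    Each inner sequence holds 0/1 flags in ranked order, where 1 marks a
    correct (or best) answer. Questions with no flagged answer contribute 0.
    """
    if not ranked_flags:
        return 0.0
    total = 0.0
    for flags in ranked_flags:
        for position, flag in enumerate(flags, start=1):
            if flag:
                total += 1.0 / position
                break
    return total / len(ranked_flags)
```

Here `relevances` lists the graded relevance of each answer in the order the ranker produced it, for example vote-derived grades in a hypothetical labelling scheme. A ranker that already returns answers in descending relevance scores an NDCG@k of 1.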
