Library and Information Service, 2018, Vol. 62, Issue (5): 132-139. DOI: 10.13266/j.issn.0252-3116.2018.05.015

• Knowledge Organization •

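  • About the authors: Yu Bengong (ORCID: 0000-0003-4170-2335), professor, PhD, E-mail: bgyu@hfut.edu.cn; Li Ting (ORCID: 0000-0002-5556-7624), master's student; Yang Ying (ORCID: 0000-0002-9912-3443), associate professor, PhD.
  • Funding:
    This work is one of the research outcomes of the National Natural Science Foundation of China projects "Research on Product R&D Knowledge Integration and Service Mechanisms Based on Manufacturing Big Data" (No. 71671057) and "Research on Dynamic Evaluation of Collaborative Performance in Complex Product R&D under Uncertain Environments" (No. 71573071).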

Keywords Extraction Method for the Social Q&A Community Based on Multi-attributes Weighted

Yu Bengong1,2, Li Ting1, Yang Ying1   

  1. School of Management, Hefei University of Technology, Hefei 230009;
    2. Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009
  • Received:2017-08-23 Revised:2017-11-14 Online:2018-03-05 Published:2018-03-05



Abstract: [Purpose/significance] Existing keyword extraction methods are poorly suited to social Q&A communities, whose texts are short, colloquial, and sparse, and they rarely consider how users' attention affects the importance of a word. As a result, they cannot extract keywords from such texts effectively. To address this problem, this paper proposes a multi-attribute weighted keyword extraction method for social Q&A communities. [Method/process] The method improves the traditional TF-IDF algorithm by introducing a tuning function and part-of-speech information, and it measures word weight through a linear weighting that fuses four user-attention attributes: the numbers of answers, follows, views, and comments. [Result/conclusion] Experiments show that the method extracts keywords from social Q&A community texts more effectively.

Keywords: social Q&A community, keyword extraction, TF-IDF, multi-attribute weighting
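
The weighting scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the tuning function, the part-of-speech weights, the attribute weights, or how the attention score is attached to a word, so `tuning`, `POS_WEIGHT`, `ATTR_WEIGHT`, `alpha`, and the word-level averaging of post attention are all assumptions chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical part-of-speech weights; the abstract gives no values.
POS_WEIGHT = {"n": 1.0, "v": 0.6, "a": 0.4}
# Hypothetical weights for (answers, follows, views, comments); they sum to 1.
ATTR_WEIGHT = (0.4, 0.3, 0.1, 0.2)


def tuning(tf):
    """Assumed tuning function: dampens raw term frequency with a log."""
    return math.log(1.0 + tf)


def attention_scores(attrs):
    """Linearly combine the four max-normalized attention counts of each post."""
    maxima = [max(col) or 1 for col in zip(*attrs)]
    return [sum(w * v / m for w, v, m in zip(ATTR_WEIGHT, row, maxima))
            for row in attrs]


def keyword_weights(posts, attrs, alpha=0.7):
    """Return one {word: weight} dict per post.

    posts: list of posts, each a list of (word, pos_tag) pairs.
    attrs: per post, a tuple (answers, follows, views, comments).
    alpha: assumed mixing factor between the text score and the attention score.
    """
    n = len(posts)
    att = attention_scores(attrs)
    df = Counter()                 # document frequency of each word
    att_sum = defaultdict(float)   # summed attention of posts containing the word
    for post, a in zip(posts, att):
        for w in {w for w, _ in post}:
            df[w] += 1
            att_sum[w] += a

    results = []
    for post in posts:
        tf = Counter(w for w, _ in post)
        pos = dict(post)  # last tag seen for each word
        scores = {}
        for w, c in tf.items():
            idf = math.log((n + 1) / (df[w] + 1)) + 1.0  # smoothed IDF
            scores[w] = tuning(c) * idf * POS_WEIGHT.get(pos[w], 0.2)
        top = max(scores.values(), default=1.0)
        # Assumption: word attention is the mean attention of posts containing it.
        results.append({w: alpha * s / top + (1 - alpha) * att_sum[w] / df[w]
                        for w, s in scores.items()})
    return results


posts = [
    [("python", "n"), ("install", "v"), ("python", "n")],
    [("install", "v"), ("error", "n")],
]
attrs = [(10, 50, 1000, 5), (1, 5, 100, 0)]
weights = keyword_weights(posts, attrs)
# Highest-weighted words of the first post come first.
print(sorted(weights[0], key=weights[0].get, reverse=True))
```

Under these assumptions every weight lies in (0, 1], so the text-based score and the attention score are on a comparable scale before they are mixed; a different normalization would be equally defensible.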
