Library and Information Service (图书情报工作), 2020, Vol. 64, Issue 11: 77-86. DOI: 10.13266/j.issn.0252-3116.2020.11.009

• Information Research •

基于超网络的微博相似度及其在微博舆情主题发现中的应用

梁晓贺, 田儒雅, 吴蕾, 张学福   

  1. 中国农业科学院农业信息研究所 北京 100081
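  • Received: 2019-01-04 Revised: 2020-01-21 Online: 2020-06-05 Published: 2020-06-05
  • About the authors: Liang Xiaohe (ORCID: 0000-0003-2005-3401), assistant research fellow, PhD, E-mail: liangxiaohe@caas.cn; Tian Ruya (ORCID: 0000-0002-9944-2081), assistant research fellow, PhD; Wu Lei (ORCID: 0000-0003-0514-2203), assistant research fellow, PhD; Zhang Xuefu (ORCID: 0000-0002-9387-7527), research fellow, doctoral supervisor.
  • Supported by:
    This paper is one of the research outcomes of the Chinese Academy of Agricultural Sciences Science and Technology Innovation Program project "Scientific and Technical Intelligence Analysis and Evaluation Innovation Team" (Grant No. CAAS-ASTIP-2016-AII) and of the Agricultural Information Institute, Chinese Academy of Agricultural Sciences basic research fund project "Topic Mining of Emergent Public Opinion in Big-Data Microblogs Based on a Weighting Strategy" (Grant No. JBYW-AII-2017-29).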

Microblog Similarity Based on Super Network and Its Application in Microblog Public Opinion Topic Detection

Liang Xiaohe, Tian Ruya, Wu Lei, Zhang Xuefu   

  1. Agricultural Information Institute of Chinese Academy of Agricultural Sciences, Beijing 100081
  • Received:2019-01-04 Revised:2020-01-21 Online:2020-06-05 Published:2020-06-05

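Abstract: [Purpose/significance] Computing microblog similarity accurately makes microblog topic mining more efficient and has practical value for public opinion governance and for safeguarding information security. To address the semantic sparsity and high dimensionality of microblog text, a super-edge similarity algorithm that incorporates non-text features of microblogs is proposed. [Method/process] The mechanism by which microblog public opinion arises is analyzed, and a super network model is used to represent how microblog public opinion topics form. The super-edge similarity algorithm is built from the similarity of each subnet layer and the contribution of each subnet layer to topic formation. [Result/conclusion] The proposed similarity method helps improve topic clustering of microblog public opinion information, and it separates topics clearly even for microblogs whose wording is highly similar.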

Keywords: super-edge similarity, topic discovery, super network, microblog

Abstract: [Purpose/significance] Accurate calculation of microblog similarity can improve the efficiency of microblog topic mining and has practical significance for public opinion governance and information security. To address the sparse, high-dimensional nature of microblog text, this paper proposes a super-edge similarity algorithm that incorporates non-text features of microblogs. [Method/process] The mechanism by which microblog public opinion arises is analyzed, and the formation of microblog public opinion topics is represented with a super network model. The super-edge similarity algorithm is constructed by calculating the similarity of each subnet layer and the contribution of each subnet layer to topic formation. [Result/conclusion] The proposed similarity method helps improve the topic clustering of microblog public opinion information. In particular, it distinguishes clearly between topics for microblogs whose literal wording is highly similar.
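The combination step described in the abstract (per-layer subnet similarity, weighted by each layer's contribution to topic formation) can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not name the subnet layers, the per-layer similarity measure, or how contributions are estimated, so the layer names (`keywords`, `users`, `hashtags`), the use of Jaccard similarity, and the fixed weights below are all assumptions made for the example.

```python
from typing import Dict, Set

# A microblog is modeled as one super-edge: for each subnet layer it lists
# the nodes of that layer the post touches. Layer names are hypothetical.
SuperEdge = Dict[str, Set[str]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two node sets (assumed per-layer measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def super_edge_similarity(
    e1: SuperEdge, e2: SuperEdge, weights: Dict[str, float]
) -> float:
    """Weighted sum of per-layer similarities, normalized by total weight.

    `weights` stands in for each layer's contribution to topic formation;
    how the paper derives these values is not stated in the abstract.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("layer weights must sum to a positive value")
    score = sum(
        w * jaccard(e1.get(layer, set()), e2.get(layer, set()))
        for layer, w in weights.items()
    )
    return score / total


# Two posts with overlapping wording and a shared hashtag, different authors.
post_a: SuperEdge = {
    "keywords": {"flood", "rescue", "city"},
    "users": {"u1"},
    "hashtags": {"#flood"},
}
post_b: SuperEdge = {
    "keywords": {"flood", "rescue", "river"},
    "users": {"u2"},
    "hashtags": {"#flood"},
}
layer_weights = {"keywords": 0.5, "users": 0.2, "hashtags": 0.3}

sim = super_edge_similarity(post_a, post_b, layer_weights)
```

Because the non-text layers enter the score alongside the keyword layer, two posts with near-identical wording can still receive different similarity values when their other layers differ, which is the kind of separation the abstract reports.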

Key words: super-edge similarity, topic detection, super network, microblog

CLC number: