Microblog Similarity Based on Super Network and Its Application in Microblog Public Opinion Topic Detection

  • Liang Xiaohe
  • Tian Ruya
  • Wu Lei
  • Zhang Xuefu
  • Agricultural Information Institute of Chinese Academy of Agricultural Sciences, Beijing 100081

Received date: 2019-01-04

Revised date: 2020-01-21

  Online published: 2020-06-05

Abstract

[Purpose/significance] Accurate calculation of microblog similarity can improve the efficiency of microblog topic mining and has practical significance for public opinion governance and information security. To address the sparsity and high dimensionality of microblog text, this paper proposes a super-edge similarity algorithm that incorporates non-text features of microblogs. [Method/process] The mechanism of microblog public opinion was analyzed, and the formation of microblog public opinion topics was expressed with a super network model. The super-edge similarity algorithm was then constructed by calculating the similarity of each subnet layer and the contribution of each subnet layer to topic formation. [Result/conclusion] The proposed similarity method helps improve the topic clustering of microblog public opinion information. In particular, for microblogs whose literal expression is highly similar, it distinguishes their topics clearly.
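
The idea of combining per-layer similarities by each layer's contribution can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not name the subnet layers, the per-layer measures, or how the contributions are obtained, so the layer names (`text`, `user`), the choice of cosine and Jaccard similarity, the example weights, and all function names below are assumptions made purely for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    """Jaccard similarity between two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def super_edge_similarity(layer_sims, weights):
    """Weighted average of per-layer similarities.

    layer_sims: hypothetical layer name -> similarity in [0, 1]
    weights:    hypothetical layer name -> contribution of that layer
    """
    total = sum(weights[name] for name in layer_sims)
    if total == 0:
        return 0.0
    return sum(weights[name] * s for name, s in layer_sims.items()) / total


# Two microblogs described by an assumed text layer and an assumed user layer.
post_a = {"text": {"flood": 1.0, "rescue": 1.0}, "user": ["u1", "u2"]}
post_b = {"text": {"flood": 1.0, "relief": 1.0}, "user": ["u2", "u3"]}

layer_sims = {
    "text": cosine(post_a["text"], post_b["text"]),
    "user": jaccard(post_a["user"], post_b["user"]),
}
weights = {"text": 0.6, "user": 0.4}  # illustrative values, not from the paper
similarity = super_edge_similarity(layer_sims, weights)
```

Normalizing by the sum of the weights keeps the result in [0, 1] whenever each layer similarity is, so the weights need not sum to one.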

Cite this article

Liang Xiaohe, Tian Ruya, Wu Lei, Zhang Xuefu. Microblog Similarity Based on Super Network and Its Application in Microblog Public Opinion Topic Detection[J]. Library and Information Service, 2020, 64(11): 77-86. DOI: 10.13266/j.issn.0252-3116.2020.11.009
