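Community Detection Algorithm Based on Sample Weighting
Received date: 2016-05-16
Revised date: 2016-08-26
Online published: 2016-10-20
Funding
This paper is one of the research outcomes of the National High Technology Research and Development Program of China ("863" Program) project "Construction of a Knowledge Management System for Microbial Digital Resources and Research on Key Technologies" (Grant No. 2014AA021503) and of the Chinese Academy of Sciences 2013 "Light of West China" Talent Training Program project "Evolution Analysis of Citation Coupling Networks and Its Application in Science and Technology Evaluation and Forecasting" (Grant No. 科发人字165号(3-6)).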
XIAO Xue, WANG Zhaowei, CHEN Yunwei, DENG Yong. Community Detection Algorithm Based on Sample Weighting[J]. Library and Information Service, 2016, 60(20): 86-93. DOI: 10.13266/j.issn.0252-3116.2016.20.011
[Purpose/significance] Community discovery is of great value for text mining. To improve the accuracy of community detection in citation networks, this paper proposes a new community detection algorithm for scientific literature based on weighted networks. [Method/process] The algorithm builds on the Louvain community detection algorithm. It uses a vector space model to compute the similarity between adjacent (citation-linked) papers and assigns that similarity as the weight of the link between them. The community structure is then detected on the resulting weighted network. [Result/conclusion] Experiments show that the proposed algorithm effectively improves the performance of community detection.
Key words: citation network; community discovery; clustering; text mining
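The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four toy papers, the TF-IDF term weighting, and the function names (`tfidf_vectors`, `weight_citation_edges`, `weighted_modularity`) are all assumptions made for illustration, since the abstract states only that a vector space model supplies the link weights. The sketch computes cosine similarity for each citation link and then evaluates the weighted modularity of a candidate partition, which is the quantity the Louvain algorithm greedily maximizes.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build a sparse TF-IDF vector (term -> weight) per tokenized paper."""
    n = len(docs)
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    vectors = {}
    for key, tokens in docs.items():
        tf = Counter(tokens)
        # Smoothed IDF (an assumption) so no term gets a zero weight
        vectors[key] = {t: c * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()}
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def weight_citation_edges(edges, docs):
    """Use the text similarity of the two linked papers as the link weight."""
    vectors = tfidf_vectors(docs)
    return {(a, b): cosine(vectors[a], vectors[b]) for a, b in edges}

def weighted_modularity(weights, community):
    """Q = sum over communities of in_c / m - (tot_c / 2m)^2 (undirected)."""
    m = sum(weights.values())
    if m == 0:
        return 0.0
    internal, strength = Counter(), Counter()
    for (a, b), w in weights.items():
        strength[community[a]] += w
        strength[community[b]] += w
        if community[a] == community[b]:
            internal[community[a]] += w
    return sum(internal[c] / m - (strength[c] / (2 * m)) ** 2 for c in strength)

# Hypothetical toy data: four papers and three citation links
docs = {
    "p1": ["community", "detection", "network"],
    "p2": ["community", "detection", "citation"],
    "p3": ["citation", "text", "clustering"],
    "p4": ["text", "clustering", "weighting"],
}
edges = [("p1", "p2"), ("p2", "p3"), ("p3", "p4")]

weights = weight_citation_edges(edges, docs)
q_split = weighted_modularity(weights, {"p1": 0, "p2": 0, "p3": 1, "p4": 1})
q_single = weighted_modularity(weights, {"p1": 0, "p2": 0, "p3": 0, "p4": 0})
print(weights)
print(q_split, q_single)
```

In this toy network the link between p2 and p3 receives a lower weight than the other two because the papers share fewer terms, so the two-community partition scores a higher weighted modularity than placing all papers in one community. In practice the weighted edges would be passed to a full Louvain implementation rather than scored against hand-picked partitions.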