INFORMATION RESEARCH

Research on the Identification Method of the Topic Hierarchical Structure in a Domain Based on the Citation Network

  • Wang Wei
  • Liang Jiwen
  • Yang Jianlin
  • 1. School of Information Management, Nanjing University, Nanjing 210023;
    2. Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023

Received date: 2022-02-25

Revised date: 2022-05-09

Online published: 2022-09-09

Abstract

[Purpose/Significance] Identifying the hierarchical structure of topics in a domain reveals the domain's knowledge structure in three dimensions and provides a reference for constructing the domain knowledge system.

[Method/Process] First, a semantically weighted citation network was built on the target dataset. A hierarchical community detection algorithm was then used to identify the hierarchical structure of the citation network. Community topic labels and core papers jointly characterized the theme of each community. After the positions of topics were adjusted according to the semantic relationships between them, the hierarchical structure and knowledge flow of the topics were visualized.

[Result/Conclusion] The method was validated in the domain of knowledge management, knowledge organization and knowledge service. A total of 117 communities were identified on the weighted citation network and adjusted to form a four-layer topic hierarchy. The topic tree diagram and the topic association graph show the semantic relationships among topics, including parallel, inclusion and overlap relationships. The results indicate that identifying hierarchical topic structure with complex network methods is an effective way to organize domain knowledge.
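The two core steps of the method (weighting citation links semantically, then extracting nested communities) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not name the similarity measure or the community detection algorithm, so the sketch assumes bag-of-words cosine similarity as the semantic weight and substitutes a much simpler technique, connected components under rising weight thresholds, for hierarchical community detection. All function names, paper IDs, texts, and thresholds are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def weighted_citation_network(texts, citations):
    """Weight each citation link by the text similarity of its two papers."""
    bags = {pid: Counter(text.lower().split()) for pid, text in texts.items()}
    return {(src, dst): cosine(bags[src], bags[dst]) for src, dst in citations}


def components(nodes, edges):
    """Connected components of an undirected graph, found with union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for u, v in edges:
        parent[find(u)] = find(v)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=lambda g: sorted(g))


def hierarchy(nodes, weights, thresholds):
    """Nested groups: each level keeps only links at or above its threshold."""
    nodes = set(nodes)
    if not thresholds:
        return sorted(nodes)
    kept = [e for e, w in weights.items()
            if w >= thresholds[0] and e[0] in nodes and e[1] in nodes]
    return [hierarchy(group, weights, thresholds[1:])
            for group in components(nodes, kept)]


# Hypothetical toy data: paper texts and (citing, cited) pairs.
texts = {
    "p1": "knowledge management in enterprises",
    "p2": "knowledge management strategy",
    "p3": "knowledge organization systems",
    "p4": "knowledge organization ontology",
    "p5": "citation network analysis",
}
citations = [("p2", "p1"), ("p4", "p3"), ("p3", "p1"), ("p5", "p3")]

weights = weighted_citation_network(texts, citations)
tree = hierarchy(texts, weights, thresholds=[0.2, 0.5])
print(tree)  # [[['p1', 'p2'], ['p3', 'p4']], [['p5']]]
```

In this toy run, the low threshold groups the four knowledge-related papers into one top-level topic and leaves p5 on its own, while the higher threshold splits the group into a management pair and an organization pair. A real analysis would replace both assumptions with the similarity measure and community detection algorithm described in the full paper.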

Cite this article

Wang Wei, Liang Jiwen, Yang Jianlin. Research on the Identification Method of the Topic Hierarchical Structure in a Domain Based on the Citation Network[J]. Library and Information Service, 2022, 66(17): 81-92. DOI: 10.13266/j.issn.0252-3116.2022.17.008
