[Purpose/significance] This paper proposes a new method for the hierarchical discovery of scientific knowledge structure, providing a reference for optimizing the knowledge structure discovery process and improving the way knowledge is organized.
[Method/process] First, a hierarchical discovery method for scientific knowledge structure is built on the LDA topic model. The method then determines the hierarchy of the knowledge structure automatically from the average similarity among topics, and divides the literature into subsets that may intersect by automatically applying a filtering threshold to the "document-topic" probability matrix. Finally, a tree diagram is used to display the scientific knowledge structure and to explore the correlation and inheritance of knowledge points. The method is also compared with HLDA, a hierarchical topic model.
[Result/conclusion] The results show that the proposed method yields a better knowledge structure, more representative knowledge topics, and higher operating efficiency. Compared with HLDA, it substantially improves the distinctness of topics within a single layer and the inheritance of topics between layers.
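The two quantities named in the method, the average similarity among topics and the threshold filter on the "document-topic" probability matrix, can be illustrated with a minimal sketch. The abstract does not state which similarity measure or threshold rule the authors use, so the cosine similarity, the fixed threshold value, the toy matrices, and the function names `average_topic_similarity` and `subsets_by_threshold` below are all assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def average_topic_similarity(topic_word):
    """Mean pairwise cosine similarity between topic-word distributions.

    Assumption: cosine similarity stands in for whatever similarity
    measure the paper actually uses. Requires at least two topics.
    """
    topic_word = np.asarray(topic_word, dtype=float)
    norms = np.linalg.norm(topic_word, axis=1, keepdims=True)
    unit = topic_word / norms
    sim = unit @ unit.T
    # Keep each unordered topic pair once, excluding the diagonal.
    rows, cols = np.triu_indices(sim.shape[0], k=1)
    return float(sim[rows, cols].mean())


def subsets_by_threshold(doc_topic, threshold):
    """Assign each document to every topic whose probability meets the threshold.

    A document can land in several subsets, so the subsets may intersect.
    Assumption: a single fixed threshold; the paper selects it automatically.
    """
    doc_topic = np.asarray(doc_topic, dtype=float)
    return {
        topic: np.flatnonzero(doc_topic[:, topic] >= threshold).tolist()
        for topic in range(doc_topic.shape[1])
    }


# Hypothetical toy inputs: 2 topics over 3 words, and 3 documents over 2 topics.
toy_topic_word = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
toy_doc_topic = [[0.70, 0.30], [0.45, 0.55], [0.10, 0.90]]

avg_sim = average_topic_similarity(toy_topic_word)
subsets = subsets_by_threshold(toy_doc_topic, threshold=0.4)
```

In this toy case the middle document meets the threshold for both topics, so it appears in both subsets. In a hierarchical procedure, each subset would then be modelled again with LDA to produce the next layer, with the average topic similarity serving as one possible signal for when to stop splitting.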
Li Hui, Tian Yadan. A Hierarchical Discovery Method of Scientific Knowledge Structure[J]. Library and Information Service, 2018, 62(13): 92-102.
DOI: 10.13266/j.issn.0252-3116.2018.13.012