Library and Information Service (图书情报工作) ›› 2022, Vol. 66 ›› Issue (17): 81-92. DOI: 10.13266/j.issn.0252-3116.2022.17.008

• Information Research •

Research on the Identification Method of the Topic Hierarchical Structure in a Domain Based on the Citation Network

Wang Wei1,2, Liang Jiwen1,2, Yang Jianlin1,2

  1. School of Information Management, Nanjing University, Nanjing 210023;
  2. Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023
  • Received: 2022-02-25 Revised: 2022-05-09 Online: 2022-09-05 Published: 2022-09-09
  • Corresponding author: Yang Jianlin, professor, PhD, doctoral supervisor, E-mail: yangjl@nju.edu.cn
  • About the authors: Wang Wei, doctoral candidate; Liang Jiwen, doctoral candidate
  • Funding:
    This paper is one of the research outcomes of the National Social Science Fund of China key project "Research on Domain Knowledge Processing and Organization Models in the Big Data Environment" (Grant No. 20ATQ006).

Abstract: [Purpose/Significance] Identifying the hierarchical structure of topics in a domain presents the domain knowledge structure in a three-dimensional, multi-level view and provides a reference for constructing a domain knowledge system. [Method/Process] First, a semantically weighted citation network is built on the target dataset and preprocessed to increase the semantic similarity of nodes within each community. Second, a hierarchical community detection algorithm identifies the hierarchical structure of the citation network, and each community's topic is described jointly by its topic labels and core papers. Finally, after topic positions are adjusted according to the semantic relations between topics, the topic hierarchy and the knowledge flow between topics are presented through visualization. [Result/Conclusion] Taking the domain formed by "knowledge management", "knowledge organization" and "knowledge service" as an example, the study demonstrates the effectiveness of the proposed method for identifying the topic hierarchical structure of a domain. The 117 communities identified on the weighted citation network form a four-layer topic hierarchy after adjustment, and the topic tree diagram and the topic association graph together show the parallel, inclusion and overlap relations among topics. This confirms that discovering topic hierarchies with complex network methods is an effective approach to domain knowledge organization.
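The first two steps of the method can be illustrated with a minimal sketch. The abstract does not name the similarity measure or the community detection algorithm, so everything here is an assumption made for illustration: bag-of-words cosine similarity stands in for the semantic weight, and connected components under increasing weight thresholds stand in for the hierarchical community detection algorithm. The paper IDs, texts, citation links and thresholds are hypothetical toy data.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def weighted_citation_edges(texts, citations):
    """Weight each citation link (citing, cited) by the text similarity of its endpoints."""
    vecs = {pid: Counter(text.lower().split()) for pid, text in texts.items()}
    return {(u, v): cosine(vecs[u], vecs[v]) for u, v in citations}

def components(nodes, edges):
    """Connected components of an undirected graph, found with union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    # Sort for a deterministic order of communities.
    return sorted((frozenset(g) for g in groups.values()), key=sorted)

def hierarchy(nodes, weighted_edges, thresholds):
    """One partition per threshold; a higher threshold yields finer, nested communities."""
    levels = []
    for t in sorted(thresholds):
        kept = [e for e, w in weighted_edges.items() if w >= t]  # prune weak links
        levels.append(components(nodes, kept))
    return levels

# Hypothetical toy data: four papers and three citation links.
texts = {
    "p1": "knowledge management in organizations",
    "p2": "knowledge management systems for organizations",
    "p3": "knowledge organization with ontology",
    "p4": "ontology based knowledge organization",
}
citations = [("p2", "p1"), ("p4", "p3"), ("p3", "p1")]

edges = weighted_citation_edges(texts, citations)
levels = hierarchy(list(texts), edges, thresholds=[0.0, 0.5])
for depth, partition in enumerate(levels):
    print(depth, [sorted(c) for c in partition])
```

On this toy input the coarse level holds a single community of all four papers, and the finer level splits it into two, because the weakly similar link between p3 and p1 is pruned. A modularity-based method applied recursively would be a closer match to "hierarchical community detection", but it is not shown here.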

Key words: knowledge structure, topic hierarchical structure, community detection, citation network, domain knowledge organization

CLC number: