Library and Information Service (图书情报工作) ›› 2019, Vol. 63 ›› Issue (9): 95-100. DOI: 10.13266/j.issn.0252-3116.2019.09.010

• Knowledge Organization •

The Discovery of Subject Basic Vocabulary from the Perspective of Keyword Co-occurrence Network

Yu Fengchang, Lu Wei

  1. School of Information Management, Wuhan University, Wuhan 430072
  • Received: 2018-06-20 Revised: 2018-11-27 Online: 2019-05-05 Published: 2019-05-05
  • Corresponding author: Lu Wei (ORCID: 0000-0002-0929-7416), professor, doctoral supervisor, E-mail: weilu@whu.edu.cn
  • About the authors: Yu Fengchang (ORCID: 0000-0002-6503-4688), doctoral candidate.

Abstract: [Purpose/significance] Basic vocabulary is a cornerstone of a discipline's knowledge. It matters for understanding how the discipline's knowledge system is composed, for tracing how that knowledge has developed, and for teaching the discipline. So far, however, it has been compiled mainly by hand, and there is not yet an efficient way to mine it automatically within a given discipline. [Method/process] This paper proposes a method that uses the keyword co-occurrence network to discover the more basic terms within a discipline. The method exploits two properties of basic terms, relatively low word frequency and relatively high centrality in the network, to extract a discipline's basic vocabulary automatically from its keyword dataset. [Result/conclusion] The method is validated on keyword datasets drawn from ACM papers published from 1969 to 2012: the computer science field as a whole (the full dataset) and two subtopics, user interfaces and information search and retrieval. The method also finds the globally basic terms in a dataset using relatively simple steps.

Key words: co-occurrence network, PageRank, subject basic vocabulary
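
The pipeline summarized in the abstract (build a keyword co-occurrence network, measure centrality, and compare it with word frequency) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how frequency and centrality are combined, so the ratio of PageRank to relative frequency used below is an assumption, as are all function names, the damping factor of 0.85, and the toy keyword lists.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(papers):
    """Return (weights, freq): symmetric edge weights and per-keyword paper counts."""
    weights = defaultdict(lambda: defaultdict(float))
    freq = Counter()
    for keywords in papers:
        unique = sorted(set(keywords))
        freq.update(unique)
        # Every pair of keywords on the same paper adds 1 to their shared edge.
        for a, b in combinations(unique, 2):
            weights[a][b] += 1.0
            weights[b][a] += 1.0
    return weights, freq


def pagerank(weights, nodes, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration on an undirected graph."""
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    out_weight = {node: sum(weights.get(node, {}).values()) for node in nodes}
    for _ in range(iterations):
        # Keywords with no co-occurring partner spread their rank uniformly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0.0)
        new_rank = {}
        for node in nodes:
            incoming = sum(
                rank[nb] * w / out_weight[nb]
                for nb, w in weights.get(node, {}).items()
            )
            new_rank[node] = (1.0 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank


def rank_basic_vocabulary(papers):
    """Order keywords by PageRank divided by relative frequency (assumed score)."""
    weights, freq = build_cooccurrence(papers)
    nodes = sorted(freq)
    rank = pagerank(weights, nodes)
    total = sum(freq.values())
    scores = {k: rank[k] / (freq[k] / total) for k in nodes}
    ordered = sorted(nodes, key=lambda k: scores[k], reverse=True)
    return ordered, rank, freq


if __name__ == "__main__":
    # Hypothetical keyword lists, one list per paper.
    toy_papers = [
        ["algorithm", "sorting"],
        ["algorithm", "graph"],
        ["algorithm", "retrieval"],
        ["retrieval", "ranking"],
    ]
    ordered, rank, freq = rank_basic_vocabulary(toy_papers)
    for keyword in ordered:
        print(keyword, freq[keyword], round(rank[keyword], 4))
```

On a real dataset the candidate list would likely need a frequency floor or a top-k cut on centrality before the ratio is applied, since very rare keywords can otherwise dominate such a score. That filtering step is likewise an assumption and is not described in the abstract.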
