Library and Information Service (图书情报工作) ›› 2016, Vol. 60 ›› Issue (23): 135-142. DOI: 10.13266/j.issn.0252-3116.2016.23.017

• Knowledge Organization •

Research on a Method for Determining the Scope of the Word Set in Co-word Analysis Based on Word Frequency, Number of Words, and Cumulative Word Frequency Proportion

Liu Minjuan, Zhang Xuefu, Yan Yun

  1. Agricultural Information Institute of Chinese Academy of Agricultural Sciences, Beijing 100081
  • Received: 2016-08-09 Revised: 2016-11-15 Online: 2016-12-05 Published: 2016-12-05
  • About the authors: Liu Minjuan (ORCID: 0000-0001-8422-2919), librarian, PhD; Zhang Xuefu (ORCID: 0000-0002-9387-7527), division director, research fellow, PhD, corresponding author, E-mail: zhangxuefu@caas.cn; Yan Yun (ORCID: 0000-0003-4181-7722), division director, research librarian.

Abstract:

[Purpose/significance] This article proposes a method for determining the scope of the word set in co-word analysis, based on how word frequency, number of words, and cumulative word frequency proportion vary with one another. It aims to improve on existing selection methods, which rely on empirical judgment alone or depend excessively on keywords with a frequency of "1", and to offer a more standardized and scientific practice for related research. [Method/process] The method takes full account of the actual distribution and characteristics of the word set, classifies words or phrases into high-, middle-, and low-frequency areas, and selects the high- and middle-frequency words together as the object of co-word analysis. [Result/conclusion] Empirical verification in a specific field and comparison with other methods show that the method can effectively select an appropriate word set scope, and it can serve as a reference for future related research.

Key words: co-word analysis, word set scope, word frequency analysis, threshold of high-frequency words
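The three quantities named in the abstract (word frequency, number of words, and cumulative word frequency proportion) can be tabulated from a keyword list in a few lines. The sketch below is only an illustration under stated assumptions: the abstract does not give the rule by which the article places the boundaries between the high-, middle-, and low-frequency areas, so the `high_cut` and `mid_cut` parameters, the function names, and the sample data are all hypothetical and are not the authors' method.

```python
from collections import Counter


def frequency_profile(keyword_lists):
    """Tabulate word frequency, number of words, and cumulative frequency share.

    Returns the per-word counts and a table of
    (frequency, n_words, cumulative_words, cumulative_share),
    ordered from the highest frequency to the lowest.
    """
    counts = Counter(word for keywords in keyword_lists for word in keywords)
    total = sum(counts.values())
    # Map each frequency value to the number of distinct words having it.
    words_at = Counter(counts.values())
    table = []
    cum_words = 0
    cum_occurrences = 0
    for freq in sorted(words_at, reverse=True):
        n_words = words_at[freq]
        cum_words += n_words
        cum_occurrences += freq * n_words
        table.append((freq, n_words, cum_words, cum_occurrences / total))
    return counts, table


def split_by_frequency(counts, high_cut, mid_cut):
    """Split words into bands using caller-supplied cut-offs.

    ASSUMPTION: high_cut and mid_cut are hypothetical inputs; the article's
    own criterion for choosing them is not stated in the abstract.
    """
    bands = {"high": [], "middle": [], "low": []}
    for word, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= high_cut:
            bands["high"].append(word)
        elif freq >= mid_cut:
            bands["middle"].append(word)
        else:
            bands["low"].append(word)
    return bands


# Invented sample data: one keyword list per paper.
papers = [
    ["co-word analysis", "word frequency"],
    ["co-word analysis", "clustering"],
    ["co-word analysis", "word frequency", "bibliometrics"],
]
counts, table = frequency_profile(papers)
bands = split_by_frequency(counts, high_cut=3, mid_cut=2)
# High- and middle-frequency words together form the co-word analysis set.
selected = bands["high"] + bands["middle"]
```

Inspecting `table` row by row shows how the number of words and the cumulative share grow as the frequency drops, which is the kind of relationship the abstract says the method relies on when deciding where the word set should end.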

Chinese Library Classification (CLC) number: