Library and Information Service (图书情报工作) ›› 2016, Vol. 60 ›› Issue (21): 122-127. DOI: 10.13266/j.issn.0252-3116.2016.21.016

• Knowledge Organization •

Research on Word Family Mining Based on a Concept Semantic Network

Du Huiping (杜慧平)

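  1. Department of Information Management, Shanghai Normal University, Shanghai 200234
  • Received: 2016-06-15 Revised: 2016-10-14 Online: 2016-11-05 Published: 2016-11-05
  • About the author: Du Huiping (ORCID: 0000-0002-9150-2393), associate research librarian, PhD, E-mail: dhp0420@163.com.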
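  • Supported by:

    This article is one of the research outcomes of the National Social Science Fund of China general project "Research on Cross-Media Knowledge Integration Services for Digital Archival Resources Based on Semantic Association" (No. 14BTQ073) and the Shanghai Normal University project "Research on Query Translation Disambiguation in Chinese-English Cross-Language Information Retrieval" (No. A-3131-12-001002).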

Word Clusters Discovery Based on Semantic Network of Concepts

Du Huiping   

  1. Information Management Department of Shanghai Normal University, Shanghai 200234
  • Received:2016-06-15 Revised:2016-10-14 Online:2016-11-05 Published:2016-11-05

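Abstract:

[Purpose/Significance] A new method for identifying word families is proposed for building semantic tools and supporting retrieval expansion, so as to reduce the cognitive burden on thesaurus compilers and improve the efficiency of constructing and updating semantic tools. [Method/Process] A concept semantic network for a subject domain is first built through co-occurrence statistics and similarity computation; the Island algorithm from social network analysis is then applied to identify the word families in that network. Taking finance as an example, the method is compared with a hierarchical clustering algorithm and the "same trailing morpheme" (词素后方一致) method in terms of word-family identification. [Results/Conclusions] The Island algorithm performs better than the hierarchical clustering algorithm. It and the "same trailing morpheme" method each have their own strengths, and the two can be combined to offset each other's weaknesses.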
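
The two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. It assumes cosine similarity over co-occurrence vectors and replaces the Island algorithm with a plain threshold cut (connected components over edges whose weight is at least `threshold`). The abstract specifies neither the similarity measure nor the island parameters, so all function names, the toy data, and the threshold value are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cooccurrence_vectors(docs):
    """Map each term to a dict {co-occurring term: number of shared documents}."""
    vectors = defaultdict(lambda: defaultdict(int))
    for terms in docs:
        for a, b in combinations(sorted(set(terms)), 2):
            vectors[a][b] += 1
            vectors[b][a] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def similarity_edges(vectors):
    """Weighted edges of the concept network: one cosine score per term pair."""
    terms = sorted(vectors)
    return {(a, b): cosine(vectors[a], vectors[b]) for a, b in combinations(terms, 2)}


def threshold_components(edges, threshold, min_size=2):
    """Assumed simplification: groups are connected components over edges >= threshold."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), weight in edges.items():
        if weight >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return sorted(sorted(g) for g in groups.values() if len(g) >= min_size)


if __name__ == "__main__":
    # Hypothetical keyword lists, one per document.
    docs = [
        ["commercial bank", "central bank", "deposit"],
        ["commercial bank", "central bank", "interest rate"],
        ["stock", "bond", "securities market"],
        ["stock", "bond", "interest rate"],
    ]
    edges = similarity_edges(cooccurrence_vectors(docs))
    for group in threshold_components(edges, threshold=0.6):
        print(group)
```

A single cut yields groups at one weight level only. The Island algorithm proper is generally described as also searching across weight levels under minimum and maximum island sizes, which this sketch does not attempt.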

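Keywords: automatic thesaurus construction, word families, concept semantic network, social network analysis, hierarchical clustering method, same trailing morpheme (词素后方一致)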

Abstract:

[Purpose/significance] This article proposes a new method for recognizing word clusters, which can be used to construct semantic tools and to support query expansion. The method reduces the cognitive burden on experts and improves the efficiency of building and updating semantic tools. [Method/process] A semantic network of concepts for a subject field is first generated through word co-occurrence analysis and semantic similarity computation; the Island algorithm from social network analysis is then used to discover word clusters in that network. Taking the finance field as an example, the article compares the proposed method with the hierarchical clustering method and the "same morpheme" method. [Result/conclusion] The Island algorithm outperforms hierarchical clustering algorithms in word cluster recognition. The Island algorithm and the "same morpheme" method each have their own advantages, so they can be used in combination to complement each other.
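
The "same morpheme" baseline (词素后方一致, terms that share the same trailing morpheme) can be sketched as suffix matching. The version below is an illustration under stated assumptions, not the procedure used in the article: it treats a term as the head of a cluster only when the head itself appears in the vocabulary, and the helper name `suffix_families` and the sample finance terms are hypothetical.

```python
from collections import defaultdict


def suffix_families(terms):
    """Group each term under the longest proper suffix that is itself a vocabulary term."""
    vocab = set(terms)
    families = defaultdict(list)
    for term in vocab:
        # Proper suffixes from longest to shortest, kept only if they are vocabulary terms.
        heads = [term[i:] for i in range(1, len(term)) if term[i:] in vocab]
        if heads:
            families[heads[0]].append(term)  # heads[0] is the longest matching suffix
    return {head: sorted(members) for head, members in families.items()}


if __name__ == "__main__":
    # Hypothetical finance vocabulary: 银行 (bank), 利率 (interest rate), 股票 (stock).
    sample = ["银行", "商业银行", "中央银行", "利率", "基准利率", "股票"]
    for head, members in suffix_families(sample).items():
        print(head, members)
```

Suffix matching of this kind only finds terms that share surface form, whereas a network-based grouping can relate terms with no characters in common, which is consistent with the abstract's conclusion that the two approaches complement each other.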

Key words: automatic thesaurus construction, word clusters, semantic network of concepts, social network analysis, hierarchical clustering, the "same morpheme" method

CLC Number: