Study on Elimination Method of Ambiguous Words in Chinese Automatic Indexing

  • Wang Dan,
  • Yang Xiaorong
  • 1. Institute of Agricultural Information, Chinese Academy of Agricultural Sciences, Beijing 100081;
    2. Key Laboratory of Agricultural Information Service Technology, Ministry of Agriculture, Beijing 100081

Received date: 2013-09-26

Revised date: 2014-02-17

Online published: 2014-03-05

Abstract

Precise retrieval from massive volumes of networked information requires, first of all, that the index terms assigned to documents contain no ambiguous words. Chinese automatic indexing often produces many ambiguous words, which leads to irrelevant results or missed information at retrieval time. This paper reviews research on eliminating crossing ambiguity (overlapping-type ambiguous words) in automatic indexing and proposes a disambiguation method that combines an exhaustive method with disambiguation rules. Experiments show that the method avoids a large number of segmentation ambiguities and yields better segmentation results.
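As an illustration of the general idea only, the following minimal Python sketch enumerates every dictionary-based segmentation of a string, detects overlapping (crossing) word pairs, and then applies one simple rule to choose a segmentation. The abstract does not state the paper's actual lexicon or disambiguation rules, so the lexicon, the function names, and the rule used here (fewest words, then fewest single-character tokens) are all assumptions made for the example.

```python
def all_segmentations(text, lexicon):
    """Exhaustively enumerate every split of `text` into lexicon words.

    Single characters are always allowed, so every string has at least
    one segmentation (an assumption of this sketch).
    """
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if end == 1 or head in lexicon:
            for tail in all_segmentations(text[end:], lexicon):
                results.append([head] + tail)
    return results


def crossing_pairs(text, lexicon):
    """Return pairs of lexicon words whose spans overlap without nesting."""
    spans = [
        (i, j)
        for i in range(len(text))
        for j in range(i + 2, len(text) + 1)
        if text[i:j] in lexicon
    ]
    return [
        (text[a:b], text[c:d])
        for (a, b) in spans
        for (c, d) in spans
        if a < c < b < d
    ]


def pick_segmentation(candidates):
    """Hypothetical rule: fewest words, then fewest single-character tokens."""
    return min(
        candidates,
        key=lambda seg: (len(seg), sum(len(w) == 1 for w in seg)),
    )


# Toy lexicon and input, chosen only for illustration.
lexicon = {"研究", "研究生", "生命", "起源"}
text = "研究生命起源"

print(crossing_pairs(text, lexicon))
print(pick_segmentation(all_segmentations(text, lexicon)))
```

In this toy input, 研究生 and 生命 share the character 生, which is what makes the string a crossing ambiguity. Exhaustive enumeration grows exponentially with string length, so a sketch like this is only practical on short ambiguous fragments.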

Cite this article

Wang Dan, Yang Xiaorong. Study on Elimination Method of Ambiguous Words in Chinese Automatic Indexing[J]. Library and Information Service, 2014, 58(05): 93-97. DOI: 10.13266/j.issn.0252-3116.2014.05.016
