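Review

Research Review on Semantics Analysis of Natural Language

  • Qin Chunxiu,
  • Zhu Ting,
  • Zhao Pengwei,
  • Zhang Yi
  • 1. School of Economics and Management, Xidian University, Xi'an 710071;
    2. School of Science Research, Xidian University, Xi'an 710070
Qin Chunxiu is an associate professor and doctoral candidate at the School of Economics and Management, Xidian University, E-mail: cxqin@xidian.edu.cn; Zhu Ting is a master's student at the School of Economics and Management, Xidian University; Zhao Pengwei is a professor at the School of Economics and Management, Xidian University, and holds a PhD; Zhang Yi is an engineer at the School of Science Research, Xidian University, and holds a master's degree.

Received date: 2014-07-24

  Revised date: 2014-10-27

  Online published: 2014-11-20

Funding

This article is one of the research outcomes of the National Natural Science Foundation of China project "Research on Peer-to-Peer Network Semantic Communities Based on Knowledge Maps and Their Knowledge Sharing" (grant no. 71103138) and the Fundamental Research Funds for the Central Universities project "Research on Business Intelligence Models Based on User-Generated Content in the Context of Big Data" (grant no. 7214484902).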

Cite this article

Qin Chunxiu, Zhu Ting, Zhao Pengwei, Zhang Yi. Research Review on Semantics Analysis of Natural Language[J]. 图书情报工作 (Library and Information Service), 2014, 58(22): 130-137. DOI: 10.13266/j.issn.0252-3116.2014.22.021

Abstract

Following the compositional levels of natural language (words, sentences, and texts), this review analyzes what semantic analysis means at each level, the existing research strategies, their theoretical foundations, and the main methods in use, and it compares the two principal research strategies. Word-level semantic analysis determines the meaning of a word and measures the semantic similarity or relatedness between two words. Sentence-level semantic analysis covers two aspects: sentence meaning analysis and sentence similarity analysis. Text-level semantic analysis is the process of identifying semantic information about a text, such as its meaning, topic, and category. Current research on natural language semantic analysis follows two main strategies: semantic analysis based on knowledge or semantic rules, and semantic analysis based on statistics. The authors argue that methods combining statistics with rules will become the mainstream approach to natural language semantic analysis, and that ontological semantics is an important foundation for it.
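As a rough illustration of the two strategies contrasted in the abstract, the sketch below scores word similarity in two ways: a knowledge-based score from path length in a small hand-built taxonomy, and a statistics-based score from the cosine of co-occurrence count vectors. The taxonomy, the corpus, and every function name here are hypothetical and chosen only for illustration; they are not the specific methods surveyed in the paper.

```python
import math
from collections import Counter

# Hypothetical toy taxonomy (child -> parent); an assumption, not from the paper.
PARENT = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
    "animal": None,
}


def ancestors(word):
    """Return the chain from `word` up to the root, with `word` first."""
    chain = []
    while word is not None:
        chain.append(word)
        word = PARENT[word]
    return chain


def path_similarity(a, b):
    """Knowledge-based score: 1 / (1 + edges on the shortest path from a to b)."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    depth_in_b = {node: i for i, node in enumerate(chain_b)}
    for i, node in enumerate(chain_a):
        if node in depth_in_b:  # first shared ancestor
            return 1.0 / (1 + i + depth_in_b[node])
    return 0.0


# Hypothetical toy corpus; a real system would use a large text collection.
CORPUS = [
    "the dog chased the cat",
    "the cat chased the mouse",
    "the sparrow sang a song",
]


def context_vector(word, window=2):
    """Count the tokens that occur within `window` positions of `word`."""
    counts = Counter()
    for sentence in CORPUS:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok == word:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
                )
    return counts


def cosine_similarity(a, b):
    """Statistics-based score: cosine of the two words' context vectors."""
    va, vb = context_vector(a), context_vector(b)
    dot = sum(va[k] * vb[k] for k in va)  # Counter yields 0 for missing keys
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0
```

On this toy data both scores rank "dog" closer to "cat" than to "sparrow", but for different reasons: the first depends entirely on how the taxonomy was authored, while the second depends on the corpus and is inflated here by function words such as "the". That trade-off is one way to read the abstract's argument for combining rules with statistics.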
