KNOWLEDGE ORGANIZATION

Integrating Word Semantic Representation and New Word Identification for Domain Ontology Evolution: A Case Study of Product Online Reviews

  • Geng Qian,
  • Deng Siyu,
  • Jin Jian
  • 1 Center for Governance Studies, Beijing Normal University, Zhuhai 519087;
    2 School of Government, Beijing Normal University, Beijing 100875

Received date: 2020-10-15

Revised date: 2021-01-12

Published online: 2021-06-02

Abstract

[Purpose/significance] Traditional ontology evolution captures new knowledge and new requirements inaccurately and inefficiently. To address this, an ontology evolution method based on domain new word identification is proposed and evaluated on a large volume of online product reviews. [Method/process] First, a series of natural language processing algorithms was used to pre-process the product review corpus, and the Word2vec algorithm was adopted to generate word embeddings. Then, a Bi-LSTM-Attention-CRF algorithm was used to recognize and extract a candidate set of new words, and the K-means algorithm was applied to cluster the candidates and obtain the final domain new words. Finally, the six-stage ontology evolution process was applied to analyze the evolution of the domain ontology. [Result/conclusion] Taking smartphone reviews as an example, the proposed new word identification approach achieves higher accuracy and recall, and a new version of the product ontology for the smartphone domain can be evolved accordingly. The approach helps designers optimize feature and function configuration in new product development and helps consumers analyze online opinions when making purchase decisions.
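
The clustering step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate words, the two-dimensional vectors standing in for Word2vec embeddings, and the helper names kmeans and group_candidates are all hypothetical, and the abstract does not say how the final domain new words are selected from the clusters. The sketch only shows how K-means groups candidate words whose vectors lie close together.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means on a list of equal-length float vectors.

    Centroids start from the first k vectors, so the result is deterministic.
    Returns one cluster label per input vector.
    """
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def group_candidates(embeddings, k):
    """Cluster candidate words by their vectors and return the word groups."""
    words = list(embeddings)
    labels = kmeans([embeddings[w] for w in words], k)
    clusters = {}
    for word, label in zip(words, labels):
        clusters.setdefault(label, []).append(word)
    return list(clusters.values())


# Hypothetical toy vectors; real Word2vec embeddings have far more dimensions.
TOY_EMBEDDINGS = {
    "fast_charging": [0.9, 0.1],
    "face_unlock": [0.1, 0.9],
    "battery_life": [0.8, 0.2],
    "fingerprint_unlock": [0.2, 0.8],
}

clusters = group_candidates(TOY_EMBEDDINGS, k=2)
for group in clusters:
    print(group)
```

On these toy vectors the two power-related candidates fall into one group and the two unlock-related candidates into the other. In practice the number of clusters and the initialization would have to be chosen for the review corpus at hand.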

Cite this article

Geng Qian, Deng Siyu, Jin Jian. Integrating Word Semantic Representation and New Word Identification for Domain Ontology Evolution: A Case Study of Product Online Reviews[J]. Library and Information Service, 2021, 65(8): 85-96. DOI: 10.13266/j.issn.0252-3116.2021.08.009
