Special Academic Issue for the 75th Anniversary of the Department of Information Management, Peking University

Research on the Knowledge Fusion Technology Taxonomy in Big Data Environment

  • Chen Mo ,
  • Li Guangjian
  • Department of Information Management, Peking University, Beijing 100871
Chen Mo, doctoral candidate

Received date: 2022-07-29

Revised date: 2022-08-26

Online published: 2022-11-17

Funding

This paper is one of the research outcomes of the National Social Science Fund of China major project "Architecture, Implementation Models, and Empirical Research of Knowledge Fusion in the Big Data Era" (Grant No. 15ZDB129).

Cite this article

Chen Mo, Li Guangjian. Research on the Knowledge Fusion Technology Taxonomy in Big Data Environment[J]. Library and Information Service, 2022, 66(20): 20-31. DOI: 10.13266/j.issn.0252-3116.2022.20.003

Abstract

[Purpose/Significance] This paper studies the key technologies of multi-source knowledge fusion in the big data environment. Drawing on the characteristics of multi-source knowledge objects in different domains, it constructs a complete technology taxonomy that provides technical support and solutions for implementing knowledge fusion in practice. [Method/Process] Existing research is examined through qualitative analysis and then organized by induction and deduction. Using literature analysis, the paper identifies the problems that knowledge fusion must solve and summarizes the task types of knowledge fusion, the workflows needed to carry out each task, and the specific technologies those workflows involve, forming a knowledge fusion technology taxonomy. [Result/Conclusion] Taking into account the intrinsic characteristics of each technology, the knowledge objects it applies to, and the level of abstraction at which it is applied, the paper establishes a technical architecture with three layers: a computation layer, a function layer, and a task layer. The three layers are interconnected, influence one another, and interlock. Moving upward, technologies can be abstracted and linked to specific knowledge fusion problems (tasks). Moving downward, they can be made concrete, that is, operational and computable technical methods can be found for solving specific knowledge fusion problems.
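
The three-layer structure described in the abstract can be pictured as two linked lookup tables, one from tasks to functions and one from functions to computational methods. The sketch below is only an illustration under stated assumptions: the class `Taxonomy`, the methods `concretize` and `abstract`, and every task, function, and method name in it are hypothetical placeholders, not entries from the paper's actual taxonomy.

```python
from dataclasses import dataclass, field


@dataclass
class Taxonomy:
    """Three-layer lookup: task -> functions -> computational methods."""
    task_to_functions: dict[str, list[str]] = field(default_factory=dict)
    function_to_methods: dict[str, list[str]] = field(default_factory=dict)

    def concretize(self, task: str) -> list[str]:
        """Move downward: list the computable methods that can serve a task."""
        methods: list[str] = []
        for function in self.task_to_functions.get(task, []):
            for method in self.function_to_methods.get(function, []):
                if method not in methods:  # keep first-seen order, no repeats
                    methods.append(method)
        return methods

    def abstract(self, method: str) -> list[str]:
        """Move upward: list the tasks that a computational method can serve."""
        functions = {f for f, ms in self.function_to_methods.items() if method in ms}
        return sorted(
            task for task, fs in self.task_to_functions.items()
            if functions.intersection(fs)
        )


# Hypothetical placeholder entries, not taken from the paper.
taxonomy = Taxonomy(
    task_to_functions={
        "entity_fusion": ["entity_alignment", "conflict_resolution"],
        "decision_support": ["conflict_resolution", "rule_inference"],
    },
    function_to_methods={
        "entity_alignment": ["string_similarity", "embedding_similarity"],
        "conflict_resolution": ["evidence_combination"],
        "rule_inference": ["evidence_combination", "bayesian_network"],
    },
)

print(taxonomy.concretize("entity_fusion"))
print(taxonomy.abstract("evidence_combination"))
```

Keeping the two mappings separate mirrors the abstract's point that the layers interlock: a method in the computation layer reaches a task only through the function layer, so the same tables answer both the downward question (which methods implement this task) and the upward question (which tasks does this method support).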
