KNOWLEDGE ORGANIZATION

Plant Knowledge Mining and Organization Construction in Pre-Qin Classics from the Perspective of Digital Humanities

  • Wu Mengcheng
  • Lin Litao
  • Qi Yue
  • Huang Shuiqing
  • Wang Dongbo
  • Liu Liu
  • 1 College of Information Management, Nanjing Agricultural University, Nanjing 210095;
    2 Research Center for Humanities and Social Computing, Nanjing Agricultural University, Nanjing 210095;
    3 Research Center for Correlation of Domain Knowledge, Nanjing Agricultural University, Nanjing 210095

Received date: 2022-12-07

Revised date: 2023-02-07

Online published: 2023-07-06

Abstract

[Purpose/Significance] Mining plant knowledge from pre-Qin classics and constructing a pre-Qin plant knowledge graph are of great significance for understanding the society and living conditions of ancient China. [Method/Process] This paper annotates the plant words in pre-Qin classics in detail and analyzes them quantitatively. Named entity recognition models for ancient Chinese plants are built on CRF and a variety of deep learning models, and their performance is compared to determine the optimal model. A knowledge-graph-oriented knowledge organization model for plants in the classics is then designed. [Result/Conclusion] The plant named entity recognition model based on SikuRoBERTa, a pre-trained language model for ancient Chinese, achieved the best performance, with a harmonic mean of precision and recall (F1) of 85.44%, providing an effective method for entity-based plant knowledge mining. The constructed knowledge graph aggregates and visually presents plant entities and their related knowledge in the pre-Qin classics.
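
The harmonic average reported in the abstract is commonly computed at the entity level from precision and recall. The following is a minimal sketch of that calculation, assuming a character-level B/I/O tagging scheme with a single plant entity type; the tag set, the function names, and the toy tag sequences are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a B/I/O tag sequence (one tag per character)."""
    spans: Set[Span] = set()
    start = None
    # A trailing "O" sentinel closes an entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        if tag != "I" and start is not None:
            spans.add((start, i))
            start = None
        if tag == "B":
            start = i
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Harmonic mean of entity-level precision and recall (exact span match)."""
    gold_spans = extract_spans(gold)
    pred_spans = extract_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical five-character sentence: two gold entities, one predicted correctly.
gold_tags = ["B", "I", "O", "B", "O"]
pred_tags = ["B", "I", "O", "O", "B"]
score = entity_f1(gold_tags, pred_tags)
```

Exact span matching is one common convention; a study may instead score partial overlaps or individual tags, which would give different figures.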

Cite this article

Wu Mengcheng, Lin Litao, Qi Yue, Huang Shuiqing, Wang Dongbo, Liu Liu. Plant Knowledge Mining and Organization Construction in Pre-Qin Classics from the Perspective of Digital Humanities[J]. Library and Information Service, 2023, 67(12): 103-113. DOI: 10.13266/j.issn.0252-3116.2023.12.010
