[Purpose/significance] Ultrasound examination is an important basis for diagnosis, but most examination data are recorded in text form. Based on these data, this paper studies a method that automatically structures natural language text and constructs a knowledge network, laying a data foundation for further mining the clinical knowledge hidden in electronic medical records (EMR). [Method/process] This paper improves the application of natural language processing technology to ultrasound text through three main steps: segmentation processing, content location, and structured recognition. These steps segment and label the ultrasound text, and on this basis an ultrasound examination knowledge network is established. [Result/conclusion] Tests on real data show that the proposed method for structuring ultrasound texts performs well. The method can automatically construct a knowledge network from batches of ultrasound texts, and can reveal the latent knowledge in the structured content of ultrasound text, namely its hierarchical relationships and attribute structure.