Library and Information Service ›› 2017, Vol. 61 ›› Issue (18): 76-83. DOI: 10.13266/j.issn.0252-3116.2017.18.010

• Information Research •

Topic Splitting and Merging Detection in the Medical Field Based on the LDA Model

Gong Xiaocui (宫小翠), An Xinying (安新颖)

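  1. Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020
  • Received: 2017-06-12 Revised: 2017-07-24 Online: 2017-09-20 Published: 2017-09-20
  • Corresponding author: An Xinying (ORCID: 0000-0002-9870-7009), associate researcher, PhD, E-mail: an.xinying@imicams.ac.cn
  • About the author: Gong Xiaocui (ORCID: 0000-0001-6815-3546), research intern, master's degree.
  • Funding:
    This work is one of the research outcomes of the National Natural Science Foundation of China project "Research on Semantics-Based Discovery of Frontier Knowledge in the Medical Field and Its Evolution Mechanism" (Grant No. 71303259) and of the Chinese Academy of Medical Sciences Central Public-Interest Basic Scientific Research Fund project "Research on Processing Mechanisms for Multi-Source Heterogeneous Data for Medical Science and Technology Evaluation" (Grant No. 2016ZX330027).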

A Research of Topic Splitting and Merging Detecting in the Medical Field Based on the LDA Model

Gong Xiaocui, An Xinying   

  1. Institute of Medical Information/Medical Library, CAMS & PUMC, Beijing 100020
  • Received: 2017-06-12 Revised: 2017-07-24 Online: 2017-09-20 Published: 2017-09-20

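Abstract: [Purpose/significance] As information resources grow rapidly in volume and variety, cross-disciplinary integration keeps emerging. Identifying and judging the development and evolution of research topics from massive information resources, quickly and proactively, is a foundation for scientific and technological innovation. [Method/process] Building on a survey of the relevant theory and taking the characteristics of medical resources into account, this paper proposes an LDA-based topic evolution detection model and the corresponding workflow. The main steps are medical subject term extraction, topic identification, topic association, key topic identification, identification of the main evolution path of the key topics, and identification of topic splitting and merging events on that path. Together they enable an in-depth, fine-grained analysis of topic evolution. [Result/conclusion] Using the literature on breast cancer treatment as the experimental case, the model is tested and the results are analyzed and verified, confirming that the proposed method is reasonably reliable.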

Keywords: LDA, topic splitting, topic merging, topic evolution

Abstract: [Purpose/significance] As medical information resources grow in amount and type and the related work becomes more interdisciplinary, it has become challenging for researchers and information personnel to keep track of how research themes develop. [Method/process] Considering the prominent position of medical research among the subject areas of scientific research, the authors developed a new topic evolution detection method. They proposed a model based on LDA for judging topic evolution in medical research and demonstrated its operating process. The main stages of the process are medical term extraction, topic identification, topic association, key topic identification, identification of the main path of the key topics, and identification of splitting and merging events on the main path. [Result/conclusion] This paper takes research on breast neoplasm treatment as the field in which to test the new model for identifying topic evolution in medical research. The test results are highly concordant with authoritative literature reviews in the field and are further confirmed by interviews with the field's leading experts, which supports the reliability of the techniques and approaches proposed in the study.
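The topic association and splitting/merging stages named in the abstract can be illustrated with a minimal sketch. The abstract does not say how topics are associated, so everything here is an assumption: topics from two adjacent time windows are linked when the cosine similarity of their topic-word weights reaches a threshold, a topic with several successors is read as a split, and a topic with several predecessors is read as a merge. The function names, the threshold of 0.5, and the toy weights are hypothetical and do not come from the paper; in practice the weights would come from a fitted LDA model.

```python
from collections import defaultdict
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two sparse word-weight dicts."""
    dot = sum(w * q.get(term, 0.0) for term, w in p.items())
    norm_p = sqrt(sum(w * w for w in p.values()))
    norm_q = sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def link_topics(earlier, later, threshold=0.5):
    """Return (earlier_id, later_id) pairs whose similarity reaches the threshold."""
    return [
        (a, b)
        for a, dist_a in earlier.items()
        for b, dist_b in later.items()
        if cosine(dist_a, dist_b) >= threshold
    ]


def detect_events(links):
    """Group links into splitting events (one-to-many) and merging events (many-to-one)."""
    successors = defaultdict(list)
    predecessors = defaultdict(list)
    for a, b in links:
        successors[a].append(b)
        predecessors[b].append(a)
    splits = {a: sorted(bs) for a, bs in successors.items() if len(bs) > 1}
    merges = {b: sorted(as_) for b, as_ in predecessors.items() if len(as_) > 1}
    return splits, merges


# Toy topic-word weights, invented for illustration (assumption, not paper data).
window_1 = {
    "T1": {"chemotherapy": 0.5, "tamoxifen": 0.5},
    "T2": {"radiotherapy": 0.4, "surgery": 0.6},
    "T3": {"mastectomy": 0.4, "surgery": 0.6},
}
window_2 = {
    "U1": {"chemotherapy": 0.8, "toxicity": 0.2},
    "U2": {"tamoxifen": 0.8, "receptor": 0.2},
    "U3": {"surgery": 0.9, "outcome": 0.1},
}

links = link_topics(window_1, window_2, threshold=0.5)
splits, merges = detect_events(links)
print("splits:", splits)
print("merges:", merges)
```

With these toy weights, T1 is linked to both U1 and U2 (a split) and T2 and T3 are both linked to U3 (a merge). The threshold controls how many links survive, so it would need tuning against real data.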

Key words: LDA, topic splitting, topic merging, topic evolution

CLC number: