[Purpose/significance] As medical information resources grow in volume and variety and the related work becomes more interdisciplinary, it has become difficult for researchers and information professionals to track how research themes develop. [Method/process] Given the prominent position of medical research among scientific subject areas, the authors propose a new method, built on the LDA model, for detecting topic evolution in medical research, and demonstrate how it operates. The main stages are medical term extraction, topic area identification, topic association, key topic identification, identification of the main path of key topics, and detection of splitting and merging events on the main path. [Result/conclusion] The paper uses breast neoplasm treatment research as a test field for the new model. The results are highly concordant with authoritative literature reviews in the field and are further confirmed by interviews with leading experts in the field, which supports the reliability of the proposed techniques and approach.
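The topic association and splitting/merging stages named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how topics are linked, so everything here is an assumption made for illustration: topics from two consecutive time slices are represented as word-probability vectors over a shared vocabulary, they are linked by cosine similarity, and the threshold of 0.5, the topic labels, and the function names `cosine`, `associate_topics`, and `detect_events` are all hypothetical. It is not the authors' implementation.

```python
import math


def cosine(p, q):
    """Cosine similarity between two equal-length topic-word vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def associate_topics(earlier, later, threshold=0.5):
    """Link each earlier topic to every later topic whose similarity
    meets the (assumed) threshold. Returns a list of (earlier, later) pairs."""
    links = []
    for i, p in earlier.items():
        for j, q in later.items():
            if cosine(p, q) >= threshold:
                links.append((i, j))
    return links


def detect_events(links):
    """A split is one earlier topic linked to several later topics;
    a merge is several earlier topics linked to one later topic."""
    successors, predecessors = {}, {}
    for i, j in links:
        successors.setdefault(i, []).append(j)
        predecessors.setdefault(j, []).append(i)
    splits = {i: js for i, js in successors.items() if len(js) > 1}
    merges = {j: srcs for j, srcs in predecessors.items() if len(srcs) > 1}
    return splits, merges


# Hypothetical topic-word distributions over a four-word vocabulary.
earlier = {"A": [0.5, 0.5, 0.0, 0.0], "B": [0.0, 0.0, 0.5, 0.5]}
later = {
    "X": [0.9, 0.1, 0.0, 0.0],
    "Y": [0.1, 0.9, 0.0, 0.0],
    "Z": [0.0, 0.0, 0.5, 0.5],
}

links = associate_topics(earlier, later)
splits, merges = detect_events(links)
```

In this toy input, topic A is similar to both X and Y, so it is reported as a split, while B simply continues as Z. In practice the topic vectors would come from LDA models fitted per time slice, and the similarity measure and threshold would need to be chosen and validated for the corpus.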
Gong Xiaocui, An Xinying. A Research of Topic Splitting and Merging Detecting in the Medical Field Based on the LDA Model[J]. Library and Information Service, 2017, 61(18): 76-83.
DOI: 10.13266/j.issn.0252-3116.2017.18.010