Library and Information Service ›› 2018, Vol. 62 ›› Issue (13): 82-91. DOI: 10.13266/j.issn.0252-3116.2018.13.011

• Information Research •

A Comparative Study of Hypertension Topic Detection and Evolution Trends Based on the SNA and DMR Methods

Zhou Liqin1, Xu Jian1, Ba Zhichao1, Zhang Bin2,3

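  1. Center for Studies of Information Resources, Wuhan University, Wuhan 430072;
    2. Center of Traditional Chinese Cultural Studies, Wuhan University, Wuhan 430072;
    3. National Institute of Cultural Development, Wuhan University, Wuhan 430072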
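  • Received: 2017-10-08 Revised: 2018-03-12 Online: 2018-07-05 Published: 2018-07-05
  • About the authors: Zhou Liqin (ORCID: 0000-0001-5105-2669), doctoral candidate, E-mail: zhoulq92@163.com; Xu Jian (ORCID: 0000-0002-0230-5137), doctoral candidate; Ba Zhichao (ORCID: 0000-0001-5626-5604), doctoral candidate; Zhang Bin (ORCID: 0000-0002-5591-7874), postdoctoral researcher and lecturer.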
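  • Funding:
    This paper is one of the research outcomes of the National Natural Science Foundation of China International (Regional) Cooperation and Exchange Project "Research on a Smart Elderly-Care Platform Based on Chronic Disease Knowledge Management" (Grant No. 71661167007), the National Natural Science Foundation of China Key International (Regional) Cooperative Research Project "Knowledge Organization and Service Innovation in the Big Data Environment" (Grant No. 71420107026), and the National Natural Science Foundation of China Youth Project "Research on the Generation and Evolution Mechanism of Scientific Knowledge from the Perspective of Mental Space" (Grant No. 71704138).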

Comparative Analysis of the Topic and Evolution Trend of Hypertension Study Based on SNA and DMR

Zhou Liqin1, Xu Jian1, Ba Zhichao1, Zhang Bin2,3   

  1. Center for Studies of Information Resources, Wuhan University, Wuhan 430072;
    2. Center of Traditional Chinese Cultural Studies, Wuhan University, Wuhan 430072;
    3. National Institute of Cultural Development, Wuhan University, Wuhan 430072
  • Received: 2017-10-08 Revised: 2018-03-12 Online: 2018-07-05 Published: 2018-07-05

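Abstract (translated from the Chinese): [Purpose/Significance] Detecting the topics and evolution trends of the medical literature on hypertension is important for identifying research hotspots and frontiers in the field, understanding its overall landscape, and promoting knowledge exchange among experts. [Method/Process] Using the bibliographic records of 26,717 hypertension-related articles downloaded from the PubMed database as the research object, the study extracts high-frequency subject terms to construct a co-occurrence matrix, and applies both social network analysis (SNA) and the Dirichlet-multinomial regression (DMR) topic model to detect the topic distribution and evolution trends of the hypertension literature at the meso and micro levels; it then compares how the two methods relate to each other and where they agree and differ. [Result/Conclusion] The hypertension medical literature concentrates on five research topics: risk factors, research methods, basic elements, diagnosis and treatment, and animal experiments. The relative distribution of these topics changes continually over time. The subject terms obtained with the SNA method are more specific and clear-cut, whereas those obtained with the DMR method are broader, although DMR has an advantage in exploring how each topic evolves.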

Keywords: hypertension, topic detection, SNA, DMR, topic model, evolution trend

Abstract: [Purpose/significance] Exploring the topics and evolution trends of the hypertension literature is of great significance for helping users understand the profile, research hotspots, and frontiers of chronic disease, and can promote knowledge communication among experts. [Method/process] This paper takes 26,717 hypertension-related articles from the PubMed database as the research object and extracts high-frequency MeSH terms to construct a co-occurrence matrix. Social network analysis (SNA) is applied to detect the communities and topic distribution of the hypertension literature, and Dirichlet-multinomial regression (DMR), an extension of topic modeling, is used to explore the topic distribution and evolution trends. The similarities and differences between the SNA and DMR methods in topic detection are then analyzed. [Result/conclusion] The hypertension literature is mainly concentrated in three communities, which can be divided into five research topics: risk factors, research methods, basic situation of patients, diagnosis and treatment, and animal experiments. The relative distribution of the topics varies over time. The topics obtained from SNA and DMR are broadly similar, but the MeSH terms obtained from the SNA method are more specific and clearer, while those from DMR are broader; DMR, however, has an advantage in exploring the evolution of the individual topics.
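The co-occurrence step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `cooccurrence`, the `min_freq` threshold, and the toy records are all assumptions made for illustration, and the paper's actual frequency cutoff is not stated here. The sketch keeps terms that appear in at least `min_freq` articles and counts, for every pair of kept terms, the number of articles in which both appear.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records, min_freq=2):
    """Build a symmetric term co-occurrence matrix.

    records: list of term lists, one list per article (hypothetical input).
    min_freq: assumed document-frequency threshold for "high-frequency" terms.
    """
    # Document frequency: number of articles in which each term appears.
    freq = Counter(term for terms in records for term in set(terms))
    # Keep only terms that meet the threshold, in a stable sorted order.
    vocab = sorted(t for t, n in freq.items() if n >= min_freq)
    index = {t: i for i, t in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for terms in records:
        kept = sorted({t for t in terms if t in index})
        # Each unordered pair of kept terms in an article counts once.
        for a, b in combinations(kept, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return vocab, matrix


# Toy records, invented for illustration only.
records = [
    ["Hypertension", "Risk Factors", "Humans"],
    ["Hypertension", "Humans", "Blood Pressure"],
    ["Hypertension", "Risk Factors", "Rats"],
]
vocab, matrix = cooccurrence(records)
for term, row in zip(vocab, matrix):
    print(term, row)
```

A matrix of this shape can serve as the weighted adjacency matrix of a term network, which is the usual input for community detection in an SNA workflow.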

Key words: hypertension, community detection, SNA, DMR, topic model, evolution trend

CLC Number: