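Information Research

Comparative Study of Topic Detection and Evolution Trends in Hypertension Research Based on the SNA and DMR Methods

  • Zhou Liqin,
  • Xu Jian,
  • Ba Zhichao,
  • Zhang Bin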
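  • 1. Center for Studies of Information Resources, Wuhan University, Wuhan 430072;
    2. Center of Traditional Chinese Cultural Studies, Wuhan University, Wuhan 430072;
    3. National Institute of Cultural Development, Wuhan University, Wuhan 430072
Zhou Liqin (ORCID: 0000-0001-5105-2669), PhD candidate, E-mail: zhoulq92@163.com; Xu Jian (ORCID: 0000-0002-0230-5137), PhD candidate; Ba Zhichao (ORCID: 0000-0001-5626-5604), PhD candidate; Zhang Bin (ORCID: 0000-0002-5591-7874), postdoctoral researcher, lecturer.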

Received date: 2017-10-08

  Revised date: 2018-03-12

  Online published: 2018-07-05

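Funding

This work is one of the research outputs of the National Natural Science Foundation of China International (Regional) Cooperation and Exchange Project "Research on a Smart Elderly-Care Platform Based on Chronic Disease Knowledge Management" (Grant No. 71661167007), the National Natural Science Foundation of China Key International (Regional) Cooperative Research Project "Knowledge Organization and Service Innovation in the Big Data Environment" (Grant No. 71420107026), and the National Natural Science Foundation of China Youth Project "Research on the Generation and Evolution Mechanism of Scientific Knowledge from the Perspective of Mental Space" (Grant No. 71704138).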

Comparative Analysis of the Topic and Evolution Trend of Hypertension Study Based on SNA and DMR

  • Zhou Liqin ,
  • Xu Jian ,
  • Ba Zhichao ,
  • Zhang Bin
  • 1. Center for Studies of Information Resources, Wuhan University, Wuhan 430072;
    2. Center of Traditional Chinese Cultural Studies, Wuhan University, Wuhan 430072;
    3. National Institute of Cultural Development, Wuhan University, Wuhan 430072

Received date: 2017-10-08

  Revised date: 2018-03-12

  Online published: 2018-07-05

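Abstract (translated from the Chinese)

[Purpose/Significance] Detecting the topics and evolution trends of the hypertension medical literature is important for discovering research hot spots and frontiers in the hypertension field, understanding the overall landscape of the field, and promoting knowledge exchange among experts. [Method/Process] Taking the bibliographic records of 26,717 hypertension-related articles downloaded from the PubMed database as the research object, high-frequency subject terms were extracted to construct a co-occurrence matrix. Social network analysis (SNA) and the Dirichlet-multinomial regression (DMR) topic model were then both applied to detect the topic distribution and evolution trends of the hypertension medical literature at the meso and micro levels, and the connections, similarities, and differences between the two methods were compared. [Result/Conclusion] The hypertension medical literature concentrates on five research topics: risk factors, research methods, basic elements, diagnosis and treatment, and animal experiments. The relative distribution of these topics changes continually over time. The subject terms obtained with the SNA method are more specific and explicit, whereas those obtained with the DMR method are broader; DMR, however, has an advantage in exploring the evolution trend of each topic.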
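The co-occurrence step described in the method can be illustrated with a minimal sketch. It assumes each bibliographic record has already been reduced to a list of subject terms; the function name `build_cooccurrence`, the `top_n` cutoff, and the toy records are hypothetical and are not taken from the paper, which does not state its frequency threshold here.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(records, top_n):
    """Return (terms, matrix) for the top_n most frequent subject terms.

    records: list of term lists, one list per article (assumed input shape).
    matrix[i][j] counts the articles in which terms[i] and terms[j] both appear.
    """
    # Document frequency: a term counts once per article, even if repeated.
    freq = Counter(term for rec in records for term in set(rec))
    # Highest frequency first; ties broken alphabetically for determinism.
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    terms = [term for term, _ in ranked[:top_n]]
    index = {term: i for i, term in enumerate(terms)}

    matrix = [[0] * len(terms) for _ in terms]
    for rec in records:
        # Keep only the high-frequency terms present in this article.
        kept = sorted({index[t] for t in rec if t in index})
        for i, j in combinations(kept, 2):
            matrix[i][j] += 1
            matrix[j][i] += 1  # the matrix is symmetric
    return terms, matrix


# Toy records, invented purely for illustration.
records = [
    ["Hypertension", "Risk Factors", "Humans"],
    ["Hypertension", "Humans", "Antihypertensive Agents"],
    ["Hypertension", "Rats", "Animals"],
    ["Hypertension", "Risk Factors", "Humans"],
]
terms, matrix = build_cooccurrence(records, top_n=3)
for term, row in zip(terms, matrix):
    print(term, row)
```

The diagonal is left at zero in this sketch; some co-word studies store the term's own frequency there instead, and which convention the authors used is not stated in the abstract.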

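Cite this article

Zhou Liqin, Xu Jian, Ba Zhichao, Zhang Bin. Comparative Study of Topic Detection and Evolution Trends in Hypertension Research Based on the SNA and DMR Methods[J]. Library and Information Service, 2018, 62(13): 82-91. DOI: 10.13266/j.issn.0252-3116.2018.13.011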

Abstract

[Purpose/significance] Exploring the topics and evolution trends of the hypertension literature is of great significance for helping users understand the profile, research hot spots, and frontiers of chronic disease, and can promote knowledge communication among experts. [Method/process] This paper takes 26,717 hypertension-related articles from the PubMed database as the research object and extracts high-frequency MeSH terms to construct a co-occurrence matrix. Social network analysis (SNA) is applied to detect the communities and topic distribution of the hypertension literature, and an extended topic model, Dirichlet-multinomial regression (DMR), is used to explore the topic distribution and evolution trends. The similarities and differences between the SNA and DMR methods in topic detection are then analyzed. [Result/conclusion] The hypertension literature is found to concentrate mainly on three communities, which can be divided into five research topics: risk factors, research methods, basic situation of patients, diagnosis and treatment, and animal experiments. The relative distribution of the topics varies over time. The topics obtained with SNA and DMR are broadly similar, but the MeSH terms obtained with SNA are more specific and clearer, while those obtained with DMR are broader; DMR also has an advantage in exploring the evolution of the individual topics.
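Community detection on a co-occurrence network is commonly scored with Newman-Girvan modularity. The sketch below only evaluates that score for a given partition; it is not the authors' pipeline, and the abstract does not say which algorithm or tool produced the three communities. The function name `modularity`, the edge dictionary format, and the toy graph are hypothetical.

```python
def modularity(weights, communities):
    """Newman-Girvan modularity Q of a partition of a weighted undirected graph.

    weights: dict mapping (u, v) to an edge weight, each undirected edge once.
    communities: dict mapping every node to its community label.
    Q = sum over communities c of  L_c / m - (d_c / (2 m)) ** 2,
    where m is the total edge weight, L_c the weight inside c,
    and d_c the summed weighted degree of the nodes in c.
    """
    total = sum(weights.values())
    internal = {}  # L_c per community
    degree = {}    # d_c per community
    for (u, v), w in weights.items():
        cu, cv = communities[u], communities[v]
        degree[cu] = degree.get(cu, 0.0) + w
        degree[cv] = degree.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(
        internal.get(c, 0.0) / total - (d / (2 * total)) ** 2
        for c, d in degree.items()
    )


# Toy graph: two triangles joined by a single bridge edge (c, d).
edges = {
    ("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
    ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1,
    ("c", "d"): 1,
}
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_split = modularity(edges, two_groups)
q_merged = modularity(edges, one_group)
print(q_split, q_merged)
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the signal a modularity-maximizing method uses when grouping co-occurring terms into communities.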
