[Purpose/significance] The feature items in scientific and technical literature, together with the associations among them, are the basic units from which the many kinds of co-occurrence phenomena are built. By mining the associations among co-occurring feature items, co-occurrence analysis can probe the regularities of scientific and technological activity from different angles, giving research managers and researchers a comprehensive, multi-angle view of how science develops. [Method/process] Through a study of the basic theory of multiple co-occurrence, this paper constructs a basic theoretical system for the multiple co-occurrence data model. The system comprises the definition of multiple co-occurrence, its research scope, its variable notation, its matrix definitions, its data organization forms, and the formula for its extension coefficient together with that formula's scope of application. In addition, building on cross-graph visualization of multiple co-occurrence, the paper constructs a knowledge discovery method for analyzing co-occurrence relationships among three or more feature items, covering the analysis of co-occurrence association strength, cited association strength, and co-occurrence burst strength. [Result/conclusion] The theoretical system extends the research scope of co-occurrence phenomena and provides basic theoretical support for moving co-occurrence analysis toward multi-angle, multi-dimensional multiple co-occurrence analysis. Empirical studies on a selection of multiple co-occurrence application cases show that the method can be applied to the analysis of research fields, research institutions, comparisons between institutions, and individual researchers, with good analytical results. Because the method system offers multiple analysis dimensions and a variety of analysis methods, it not only reproduces what single and double co-occurrence analysis can achieve but also reveals knowledge that is broader and deeper than ordinary co-occurrence analysis provides.
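As a rough illustration of analyzing three or more feature items together, the sketch below counts how many papers contain each combination of values drawn from several feature types. It is only a minimal counting example under stated assumptions: the record layout, the field names (`authors`, `keywords`, `institutions`), and the function `multiple_cooccurrence` are hypothetical, and it does not implement the paper's extension coefficient or its association and burst strength formulas, which are not given in this abstract.

```python
from collections import Counter
from itertools import product

# Hypothetical toy records: each paper lists values for three feature types.
papers = [
    {"authors": ["A1", "A2"], "keywords": ["k1", "k2"], "institutions": ["I1"]},
    {"authors": ["A1"], "keywords": ["k1"], "institutions": ["I1", "I2"]},
    {"authors": ["A2"], "keywords": ["k2"], "institutions": ["I1"]},
]


def multiple_cooccurrence(records, fields):
    """Count, for every combination of one value per field, the number of
    records in which all of those values appear together."""
    counts = Counter()
    for rec in records:
        # De-duplicate values within a record so each paper counts once
        # per combination.
        value_sets = [sorted(set(rec.get(f, []))) for f in fields]
        for combo in product(*value_sets):
            counts[combo] += 1
    return counts


# Three-way (author, keyword, institution) co-occurrence counts.
triples = multiple_cooccurrence(papers, ("authors", "keywords", "institutions"))

# The same function with two fields gives ordinary pairwise co-occurrence.
pairs = multiple_cooccurrence(papers, ("authors", "keywords"))

for combo, n in triples.most_common():
    print(combo, n)
```

Passing a longer tuple of field names extends the same count to four or more feature types, at the cost of a rapidly growing number of combinations per paper.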