Empirical Study of a Semantic and Proximity-based Author Co-citation Analysis Method

  • Zhang Ruhao
  • Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041; Department of Library and Information Science, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100049; National Science Library, Chinese Academy of Sciences, Beijing 100190

Received date: 2019-09-27

Revised date: 2019-12-17

Online published: 2020-04-20

Abstract

[Purpose/significance] Author co-citation analysis is a vital method for exploring the knowledge structure of a domain. As disciplines develop in increasingly complex ways, measuring the relatedness of authors by co-citation frequency alone has become controversial. This study proposes an improved author co-citation analysis method based on the semantic similarity of citation content and the proximity of citation locations. [Method/process] After introducing the basic principles of the method, the study takes the field of library and information science (LIS) as an example to demonstrate its effect. Full-text citation mining was conducted on Chinese journals in CNKI, and the citing sentences and reference positions were extracted. Using pre-trained domain word embedding models, the deep correlation between co-cited documents and the strength of the connection between authors were measured. Network analysis and factor analysis were then used to compare the results of the proposed method with those of the traditional method. [Result/conclusion] The results show that the method identifies the strength of correlation between authors more accurately, reveals a more detailed subject knowledge structure, and offers a degree of scalability and applicability.
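
The idea of combining content similarity with location proximity can be illustrated with a minimal sketch. It is not the author's implementation: the abstract gives neither the weighting scheme nor the embedding model, so the toy vectors, the four proximity levels and their weights, the multiplicative combination, and every function name below are assumptions made only for illustration.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy vectors standing in for a pre-trained domain embedding model.
TOY_VECTORS = {
    "citation": [0.9, 0.1, 0.0],
    "analysis": [0.8, 0.2, 0.1],
    "network": [0.1, 0.9, 0.2],
    "embedding": [0.2, 0.1, 0.9],
}

# Assumed proximity weights; the abstract does not state the actual values.
PROXIMITY_WEIGHT = {"sentence": 1.0, "paragraph": 0.5, "section": 0.25, "article": 0.125}


def sentence_vector(tokens):
    """Average the vectors of known tokens; return None if no token is known."""
    vecs = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def proximity_level(pos_a, pos_b):
    """Positions are (section, paragraph, sentence) indices within one citing article."""
    if pos_a == pos_b:
        return "sentence"
    if pos_a[:2] == pos_b[:2]:
        return "paragraph"
    if pos_a[0] == pos_b[0]:
        return "section"
    return "article"


def cocitation_strength(citations):
    """Sum similarity * proximity weight over all author pairs in one citing article."""
    strength = {}
    for a, b in combinations(citations, 2):
        if a["author"] == b["author"]:
            continue
        va, vb = sentence_vector(a["tokens"]), sentence_vector(b["tokens"])
        if va is None or vb is None:
            continue
        similarity = max(cosine(va, vb), 0.0)  # clip negative similarity to zero
        level = proximity_level(a["position"], b["position"])
        pair = tuple(sorted((a["author"], b["author"])))
        strength[pair] = strength.get(pair, 0.0) + similarity * PROXIMITY_WEIGHT[level]
    return strength


# Usage: authors A and B are cited in the same sentence, C in another section.
example = [
    {"author": "A", "position": (0, 0, 0), "tokens": ["citation", "analysis"]},
    {"author": "B", "position": (0, 0, 0), "tokens": ["citation", "analysis"]},
    {"author": "C", "position": (2, 1, 0), "tokens": ["network"]},
]
weights = cocitation_strength(example)
```

Under these assumptions, the pair cited in the same sentence with similar wording receives a much larger weight than a pair cited in different sections, whereas a plain frequency count would score every pair in the article equally.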

Cite this article

Zhang Ruhao. Empirical Study of a Semantic and Proximity-based Author Co-citation Analysis Method[J]. Library and Information Service, 2020, 64(8): 111-124. DOI: 10.13266/j.issn.0252-3116.2020.08.013
