[Purpose/significance] A citation context is a descriptive or evaluative sentence in a scientific paper that contains one or more references. Extracting and analyzing the cue words in citation contexts makes it possible to identify citation behavior and motivations. [Method/process] Taking citation contexts in the Journal of Informetrics as an example, we selected three commonly used kinds of cue words (personal pronouns, behavioral verbs, and conjunctions) and calculated the frequency, proportion, and ranking of each in citation contexts. By comparing the occurrence of cue words in citation contexts and non-citation contexts, we characterized the common sentence patterns of citation contexts. [Result/conclusion] In the Journal of Informetrics, citation contexts mainly show the following characteristics: a) they focus on first-person and third-person perspectives, presenting not only the work of others but also the authors' own research; b) citing papers tend to cite references for their research methods, and the most commonly used behavioral verbs are "use", "base", and "study"; c) adversative and enumerative sentences are the main modes of presentation, and the most used conjunctions are "also" and "but". In conclusion, citation context analysis based on cue words is valuable for better understanding the functions and motivations of citations in scientific papers.
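The comparison described in the method can be illustrated with a minimal Python sketch. It is not the authors' implementation: the citation-marker pattern, the pronoun list, and the function names are assumptions made for illustration, and only the verbs and conjunctions are taken from the abstract. Matching is on exact word forms, so inflected forms such as "used" are not counted; a real analysis would likely lemmatize first.

```python
import re
from collections import Counter

# Hypothetical cue-word lists. The verbs and conjunctions are those named in
# the abstract; the pronouns are an assumption for illustration.
CUE_WORDS = {
    "pronoun": {"we", "our", "they", "their"},
    "verb": {"use", "base", "study"},
    "conjunction": {"also", "but"},
}

# Assumed citation markers: numeric brackets such as [12] or [3, 5-7],
# and author-year parentheses such as (Hirsch, 2005).
CITATION_PATTERN = re.compile(
    r"\[\d+(?:\s*[,-]\s*\d+)*\]|\([A-Z][^()]*\d{4}[a-z]?\)"
)


def is_citation_context(sentence):
    """Treat a sentence as a citation context if it contains a citation marker."""
    return CITATION_PATTERN.search(sentence) is not None


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def cue_word_stats(sentences):
    """For each group, return the share of sentences containing each cue word."""
    totals = Counter()
    hits = {"citation": Counter(), "non_citation": Counter()}
    for sentence in sentences:
        group = "citation" if is_citation_context(sentence) else "non_citation"
        totals[group] += 1
        tokens = set(tokenize(sentence))
        for words in CUE_WORDS.values():
            for word in words & tokens:
                hits[group][word] += 1
    return {
        group: {word: n / totals[group] for word, n in counter.items()}
        for group, counter in hits.items()
        if totals[group]
    }


# Made-up sample sentences, not data from the study.
sentences = [
    "We use the h-index [21] to rank authors.",
    "But this approach also has limits (Hirsch, 2005).",
    "The data set contains 500 papers.",
    "We also report the results.",
]
stats = cue_word_stats(sentences)
```

Ranking the cue words within each group then amounts to sorting each inner dictionary by its proportion, and comparing the two groups shows which cue words are characteristic of citation contexts.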
Hu Zhigang, Sun Tai'an, Wang Xianwen. Citation Context Analysis Based on Cue Words: A Case Study of Journal of Informetrics[J]. Library and Information Service, 2017, 61(23): 25-33.
DOI: 10.13266/j.issn.0252-3116.2017.23.003