[Purpose/Significance] This paper analyzes the use characteristics of scientific datasets by combining quantitative analysis and content analysis, quantitatively evaluates the impact of scientific datasets on discipline development, and provides a reference for scientific data management services and policy research. [Method/Process] Using text mining and bibliometric methods on the full-text literature in PubMed Central, this study investigated the use of scientific datasets from 7 aspects, including time distribution and use intensity, and on this basis evaluated the actual impact of scientific datasets on discipline development. [Result/Conclusion] The results show that the influence of scientific datasets on research in the biomedical field is steadily increasing. Data publishing and high-level journals promote the opening and sharing of scientific datasets. The use of scientific datasets is concentrated in the second half of papers, and there are few formal references to them. The corresponding standards and specifications need to be further strengthened.
Yang Ning, Zhang Zhiqiang. Research on the Use Characteristics of Scientific Datasets Combined with Quantitative Analysis and Content Analysis[J]. Library and Information Service, 2022, 66(10): 122-130.
DOI: 10.13266/j.issn.0252-3116.2022.010.011