Knowledge Organization

The Development Trend and Promotion Strategy of Science Data Repository

  • 于会萍 ,
  • 宛玲
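  • 1. School of Management, Hebei University, Baoding 071002;
    2. Library of North China Electric Power University, Baoding 071002
Yu Huiping (于会萍), associate research librarian, doctoral candidate.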

Received: 2022-02-14

  Revised: 2022-05-08

  Published online: 2022-08-17

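Funding

This paper is one of the research outputs of the Hebei Provincial Social Science Fund project "Application of Altmetrics in the Evaluation of Research Outputs in an Open Data Environment" (Project No. HB19TQ005).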

The Development Trend and Promotion Strategy of Science Data Repository

  • Yu Huiping ,
  • Wan Ling
  • 1. School of Management, Hebei University, Baoding 071002;
    2. Library of North China Electric Power University, Baoding 071002

Received date: 2022-02-14

  Revised date: 2022-05-08

  Online published: 2022-08-17

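Abstract

[Purpose/Significance] Using metadata from the Re3data platform, this paper analyzes and discusses the development trend of scientific data repositories along multiple dimensions and, on the basis of the results, proposes promotion strategies to inform decisions on the further development of scientific data storage infrastructure in an open science environment. [Method/Process] A total of 2,767 metadata records were collected from the Re3data platform through its API, then cleaned and assembled into a dataset. On this basis, the development of repositories in China and abroad was surveyed at the macro level from three perspectives (basic characteristics, repository management, and repository services), supplemented by micro-level case descriptions of representative repositories. The challenges currently facing data repositories were then summarized and promotion strategies proposed. [Result/Conclusion] The total number of scientific data repositories continues to grow; content formats and repository types are becoming more diverse; and multiple technical frameworks, metadata standards, and forms of data service coexist. This diversity also brings challenges. To meet them, it is necessary to build a virtuous-cycle incentive ecosystem for data sharing, improve the interoperability of heterogeneous repository platforms, promote the standardization of discipline-level metadata standards, and strengthen the training and guidance of repository data managers.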

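Cite this article

Yu Huiping, Wan Ling. The Development Trend and Promotion Strategy of Science Data Repository[J]. Library and Information Service (图书情报工作), 2022, 66(15): 107-115. DOI: 10.13266/j.issn.0252-3116.2022.15.011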

Abstract

[Purpose/Significance] Based on metadata from the Re3data platform, this paper presents a multi-dimensional analysis of the development trend of scientific data repositories and, on the basis of the results, puts forward promotion strategies to provide decision-making references for the further development of scientific data repository infrastructure in an open science environment. [Method/Process] We collected 2,767 metadata records from the Re3data platform through its API, cleaned the data, and built a dataset. On the basis of this dataset, we carried out multi-dimensional macro-level observation of the development of domestic and foreign repositories at three levels: the basic situation of repositories, repository management, and repository services. This was complemented by micro-level case descriptions of representative repositories. We then summarized the challenges faced by current data repositories and proposed strategies for advancement. [Result/Conclusion] The total number of scientific data repositories is increasing continuously. The content formats and types of scientific data repositories are becoming more diversified. Multiple technical frameworks, metadata standards, and forms of data service coexist. This diversity also brings challenges. To meet them, it is necessary to build a virtuous-cycle incentive ecosystem for data sharing, enhance the interoperability of heterogeneous data repository platforms, promote the standardization of discipline-level repository metadata standards, and strengthen the training and guidance of repository data managers.
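The collect, clean, and tally workflow summarized in the Method/Process statement can be illustrated with a minimal Python sketch. It is not the authors' code: the XML layout, the element names (`name`, `country`, `type`), the sample values, and the helper functions `parse_records`, `clean`, and `count_by` are all hypothetical, and the real Re3data schema is richer than this simplified stand-in. The sketch parses a small embedded sample instead of calling the API, so it runs offline.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, simplified sample; NOT the actual Re3data schema or data.
SAMPLE_XML = """
<repositories>
  <repository><name>Repo A</name><country>USA</country><type>disciplinary</type></repository>
  <repository><name>Repo B</name><country>DEU</country><type>institutional</type></repository>
  <repository><name> Repo C </name><country>USA</country><type>disciplinary</type></repository>
  <repository><name>Repo A</name><country>USA</country><type>disciplinary</type></repository>
</repositories>
"""


def parse_records(xml_text):
    """Turn each <repository> element into a dict of tag -> stripped text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in node}
        for node in root.findall("repository")
    ]


def clean(records):
    """Drop records with an empty name and duplicates (first occurrence wins)."""
    seen = set()
    unique = []
    for record in records:
        name = record.get("name", "")
        if name and name not in seen:
            seen.add(name)
            unique.append(record)
    return unique


def count_by(records, field):
    """Count records per value of one metadata field (one analysis dimension)."""
    return Counter(record.get(field) or "unknown" for record in records)


records = clean(parse_records(SAMPLE_XML))
by_country = count_by(records, "country")
by_type = count_by(records, "type")
print(by_country)
print(by_type)
```

Each call to `count_by` corresponds to one dimension of the macro-level survey; further dimensions would be added by tallying other metadata fields in the same way.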
