Theoretical Research

Research on the Construction of Scientific Data Quality Evaluation System Based on Scientific Data Lifecycle Management Phases

  • Jiang Hong,
  • Wang Chunxiao
  • 1 Wuhan Library, Chinese Academy of Sciences, Wuhan 430071;
    2 Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190
Jiang Hong (ORCID: 0000-0003-3806-1856), deputy director, research professor, master's supervisor, E-mail: jianghong@mail.whlib.ac.cn; Wang Chunxiao (ORCID: 0000-0002-2131-5111), master's student.

Received date: 2019-10-29

  Revised date: 2020-02-09

  Online published: 2020-05-20

Cite this article

Jiang Hong, Wang Chunxiao. Research on the Construction of Scientific Data Quality Evaluation System Based on Scientific Data Lifecycle Management Phases[J]. Library and Information Service, 2020, 64(10): 19-27. DOI: 10.13266/j.issn.0252-3116.2020.10.003

Abstract

[Purpose/significance] This study examines the scientific data quality evaluation indicators used by 15 scientific data centers in China and abroad, in order to identify common indicators that objectively reflect scientific data quality and to build a generally applicable evaluation indicator system. [Method/process] Using document investigation, web survey and content analysis, the evaluation indicators of the 15 scientific data centers were collated and analyzed to establish which indicators existing scientific data institutions use. [Result/conclusion] Based on the phases of scientific data lifecycle management, the study constructs a scientific data quality evaluation indicator model with five dimensions: data management plan, data collection management, data analysis and processing management, data preservation management, and data sharing and utilization management. The model offers a reference for national and local scientific data centers in China that are establishing decision-oriented evaluation systems for scientific data centers.
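The screening step described in the abstract (keeping indicators that recur across data centers and organizing them by lifecycle phase) can be illustrated with a minimal Python sketch. Only the five dimension names come from the abstract. The function `common_indicators`, the `min_centers` threshold, the center names, and the indicator names are hypothetical, introduced for illustration only; the paper's actual screening rule and indicators are not given here.

```python
from collections import Counter
from enum import Enum


class LifecyclePhase(Enum):
    """The five dimensions named in the abstract."""
    MANAGEMENT_PLAN = "data management plan"
    COLLECTION = "data collection management"
    ANALYSIS_PROCESSING = "data analysis and processing management"
    PRESERVATION = "data preservation management"
    SHARING_UTILIZATION = "data sharing and utilization management"


def common_indicators(center_indicators, min_centers):
    """Group indicators used by at least `min_centers` centers by phase.

    `center_indicators` maps a center name to a list of
    (LifecyclePhase, indicator name) pairs. The frequency threshold is an
    assumption, not the paper's stated criterion.
    """
    counts = Counter()
    for indicators in center_indicators.values():
        # Use a set so a center listing an indicator twice counts once.
        counts.update(set(indicators))

    grouped = {phase: [] for phase in LifecyclePhase}
    for (phase, name), n in sorted(
        counts.items(), key=lambda kv: (kv[0][0].value, kv[0][1])
    ):
        if n >= min_centers:
            grouped[phase].append(name)
    return grouped


# Hypothetical input: made-up centers and indicator names.
example = {
    "center_a": [
        (LifecyclePhase.COLLECTION, "metadata completeness"),
        (LifecyclePhase.SHARING_UTILIZATION, "license stated"),
    ],
    "center_b": [
        (LifecyclePhase.COLLECTION, "metadata completeness"),
    ],
    "center_c": [
        (LifecyclePhase.COLLECTION, "metadata completeness"),
        (LifecyclePhase.SHARING_UTILIZATION, "license stated"),
        (LifecyclePhase.PRESERVATION, "backup policy"),
    ],
}

result = common_indicators(example, min_centers=2)
for phase, names in result.items():
    print(f"{phase.value}: {names}")
```

Keying each indicator by its lifecycle phase keeps the output aligned with the five-dimension model, and raising `min_centers` narrows the result to indicators shared by more centers.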
