Library and Information Service, 2020, Vol. 64, Issue (10): 109-117. DOI: 10.13266/j.issn.0252-3116.2020.10.012

• Knowledge Organization •

Research on the Construction of Linked Data Model for Research Entity's Name Authority Data

Zhou Yi, Zhang Jianyong, Liu Zheng, Liu Xiumin

  1. National Science Library, Chinese Academy of Sciences, Beijing 100190
  • Received: 2019-11-25 Revised: 2020-02-19 Online: 2020-05-20 Published: 2020-05-20
  • About the authors: Zhou Yi (ORCID: 0000-0002-1494-6716), librarian, master's degree; Zhang Jianyong (ORCID: 0000-0001-7533-1726), research librarian, master's degree; Liu Zheng (ORCID: 0000-0002-2494-436X), associate research librarian, PhD, corresponding author, E-mail: liuz@mail.las.ac.cn; Liu Xiumin (ORCID: 0000-0001-6014-9614), librarian, master's degree.
  • Funding:
    This paper is one of the research outcomes of the project "Construction of a Name Authority Database" (project No. 科1817), funded by the National Science and Technology Library (NSTL).

Abstract: [Purpose/significance] This paper studies the main difficulty in publishing the research entity name authority data of the National Science and Technology Library (NSTL) as linked data: the linked data model. A data model for this data helps NSTL's research entity data to be shared, interlinked, and improved in quality, and to become part of the web. Once published, the data can be reused as an open linked data set by other systems and organizations and integrated with other linked data sets. The model also gives other institutions a reference for using and publishing linked data. [Method/process] First, the paper analyzes and compares the data models used in linked data publishing projects in China and abroad, and finds that they fall into two main categories: models built around Schema.org as the core vocabulary, and models that combine multiple standard vocabularies. Then, based on the characteristics of NSTL's name authority data, it designs a linked data model of each form, compares the two in terms of how fully each expresses the name authority data and how complex each model is, and selects the better one. Finally, it uses D2RQ as the tool in an experiment that publishes sample NSTL name authority data as linked data. [Result/conclusion] The analysis finds that, compared with the model combining multiple standard vocabularies, the model with Schema.org as the core standard vocabulary expresses the data more completely, has lower model complexity, and integrates more easily into the web. It is therefore more suitable as the linked data model for NSTL's name authority data.

Key words: research entities, name authority, linked data, data model
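
Illustrative note: the abstract does not list the classes or properties of the selected model, so the following is only a minimal sketch of what a Schema.org-centered description of one name authority record might look like, serialized as JSON-LD. The function name `build_person_record`, the `example.org` URIs, the sample names, and the choice of the Schema.org properties `name`, `alternateName`, `affiliation`, and `sameAs` are all assumptions made for illustration. They are not taken from the paper or from NSTL's data.

```python
import json


def build_person_record(uri, preferred_name, variant_names, org_uri, same_as):
    """Describe one hypothetical person authority record with Schema.org terms.

    The property selection is an assumption for illustration only:
    - name:          the authorized (preferred) form of the name
    - alternateName: variant forms of the name
    - affiliation:   link to the organization the person belongs to
    - sameAs:        links to the same entity in external data sets
    """
    return {
        "@context": "https://schema.org",
        "@id": uri,
        "@type": "Person",
        "name": preferred_name,
        "alternateName": list(variant_names),
        "affiliation": {"@id": org_uri},
        "sameAs": list(same_as),
    }


# All identifiers and names below are made-up placeholders.
record = build_person_record(
    uri="http://example.org/authority/person/0001",
    preferred_name="Zhang, San",
    variant_names=["San Zhang", "Zhang S."],
    org_uri="http://example.org/authority/org/0001",
    same_as=["http://example.org/external/person/abc"],
)

print(json.dumps(record, indent=2, ensure_ascii=False))
```

A single-vocabulary description like this keeps every property in one namespace, which is one plausible reading of the lower model complexity the abstract attributes to the Schema.org-centered model. A design combining multiple vocabularies would instead draw each property from a different namespace.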

CLC Number: