A Review of Scientific Data Semantic Description

  • Zhou Yu
  • Liao Siqin

Library of Southwest University of Science and Technology, Mianyang 621010

Received date: 2017-04-17

Revised date: 2017-05-28

Online published: 2017-06-20


[Purpose/significance] Drawing on research papers on the semantic description of scientific data, this paper reviews the research status and hot topics in this area, points out the weaknesses of current research, and offers suggestions for the development of scientific data semantic description in China. [Method/process] Based on research papers and dissertations on scientific data semantic description published from January 2007 to December 2016 and drawn from authoritative data sources, this paper investigates the research topics using comparative and inductive methods and reviews the relevant research achievements. [Result/conclusion] The results are as follows: ① The development of scientific data semantic description is unbalanced, and most research achievements focus on the natural sciences. ② Analyzing semantic description methods along the dimensions of annotation depth, processing level, and manifestation makes it easier to see their essential features and differences. ③ Research on scientific data semantic description still has shortcomings, and further studies should focus on dynamic data semantic description, visual retrieval, data integration, and knowledge discovery.

Cite this article

Zhou Yu, Liao Siqin. A Review of Scientific Data Semantic Description[J]. Library and Information Service, 2017, 61(12): 136-144. DOI: 10.13266/j.issn.0252-3116.2017.12.018


