Enhancing the core scientific metadata model to incorporate derived data

  • Authors:
  • Erica Yang; Brian Matthews; Michael Wilson

  • Affiliations:
  • Bodleian Libraries, University of Oxford, Oxford OX2 0EW, UK and STFC e-Science, Rutherford Appleton Laboratory, HSIC, Didcot, OX11 0QX, UK; STFC e-Science, Rutherford Appleton Laboratory, HSIC, Didcot, OX11 0QX, UK; STFC e-Science, Rutherford Appleton Laboratory, HSIC, Didcot, OX11 0QX, UK

  • Venue:
  • Future Generation Computer Systems
  • Year:
  • 2013

Abstract

Much of the value of scientific data lies not only in the raw data but also in the analysis that derives published results from it. A study of the data analysis process in structural science has shown that various data sets derived from the raw data are useful to scientists and should be stored alongside it. The Core Scientific MetaData model (CSMD) is used by a number of large scientific facilities to catalogue scientific data. The current version supports experimental scientists in accessing their raw data, facility managers in accounting for facility usage, and other scientists who wish to re-use raw experimental data. This paper presents extensions to the CSMD that describe the analysis process so that the provenance of derived data can be captured. A pilot implementation incorporating derived data through the extended CSMD has been trialled with experimental scientists. Remaining challenges to the adoption of the CSMD and the tools it supports are considered.