Data product configuration management and versioning in large-scale production of satellite scientific data

  • Authors:
  • Bruce R. Barkstrom

  • Affiliations:
  • Atmospheric Sciences Data Center, NASA Langley Research Center, Hampton, VA

  • Venue:
  • SCM'01/SCM'03 Proceedings of the 2001 ICSE Workshops on SCM 2001, and SCM 2003 conference on Software configuration management
  • Year:
  • 2003

Abstract

This paper describes a formal structure for keeping track of files, source code, scripts, and related material for large-scale Earth science data production. We first describe the environment and processes that govern this configuration management problem. Then, we show that a graph with typed nodes and arcs can describe the derivation of production design and of the produced files and their metadata. The graph provides three useful by-products:

  • a hierarchical data file inventory structure that can help system users find particular files,
  • methods for creating production graphs that govern job scheduling and provenance graphs that track all of the sources and transformations between raw data input and a particular output file,
  • a systematic relationship between different elements of the structure and development documentation.
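
As a rough illustration of the typed-graph idea in the abstract, the following Python sketch stores a type label on each node and arc and walks the arcs backwards to collect the provenance of one output file. The class name, node types ("file", "source_code", "process"), arc types, and example node names are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
from collections import defaultdict


class TypedGraph:
    """Directed graph whose nodes and arcs each carry a type label (illustrative only)."""

    def __init__(self):
        self.node_types = {}               # node name -> node type
        self.incoming = defaultdict(list)  # target name -> [(source name, arc type)]

    def add_node(self, name, node_type):
        self.node_types[name] = node_type

    def add_arc(self, source, target, arc_type):
        # Require both endpoints to exist so every arc connects typed nodes.
        if source not in self.node_types or target not in self.node_types:
            raise KeyError("both endpoints must be added as nodes first")
        self.incoming[target].append((source, arc_type))

    def provenance(self, name):
        """Return the set of all nodes the given node derives from, transitively."""
        seen = set()
        stack = [name]
        while stack:
            current = stack.pop()
            for source, _arc_type in self.incoming.get(current, []):
                if source not in seen:
                    seen.add(source)
                    stack.append(source)
        return seen


# Hypothetical example: one raw file and one code version feed a job that writes one file.
graph = TypedGraph()
graph.add_node("raw_granule", "file")
graph.add_node("calibration_code_v2", "source_code")
graph.add_node("calibrate_job", "process")
graph.add_node("calibrated_granule", "file")
graph.add_arc("raw_granule", "calibrate_job", "input_to")
graph.add_arc("calibration_code_v2", "calibrate_job", "executed_by")
graph.add_arc("calibrate_job", "calibrated_granule", "produces")

ancestors = graph.provenance("calibrated_granule")
```

Here `ancestors` contains every file, code version, and job upstream of the output file, which is one simple reading of a provenance graph. A production graph for job scheduling would traverse the same arcs in the forward direction.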