Chimera: A Virtual Data System for Representing, Querying, and Automating Data Derivation

  • Authors:
  • Ian T. Foster;Jens-S. Vöckler;Michael Wilde;Yong Zhao

  • Venue:
  • SSDBM '02 Proceedings of the 14th International Conference on Scientific and Statistical Database Management
  • Year:
  • 2002

Abstract

Much scientific data is not obtained from measurements but rather derived from other data by the application of computational procedures. We hypothesize that explicit representation of these procedures can enable documentation of data provenance, discovery of available methods, and on-demand data generation (so-called "virtual data"). To explore this idea, we have developed the Chimera virtual data system, which combines a virtual data catalog, for representing data derivation procedures and derived data, with a virtual data language interpreter that translates user requests into data definition and query operations on the database. We couple the Chimera system with distributed "Data Grid" services to enable on-demand execution of computation schedules constructed from database queries. We have applied this system to two challenge problems, with promising results: the reconstruction of simulated collision event data from a high-energy physics experiment, and the search of digital sky survey data for galactic clusters.
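
The idea of a catalog that records derivation procedures and answers requests for data that may not exist yet can be illustrated with a small sketch. This is not Chimera's schema, virtual data language, or API. The names `Derivation`, `VirtualDataCatalog`, `define`, `provenance`, and `plan`, and the dataset names in the usage lines, are all hypothetical, and the sketch assumes that each dataset has at most one recorded derivation and that the derivation graph is acyclic.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Derivation:
    """One recorded application of a procedure (hypothetical record type)."""
    transformation: str        # name of the computational procedure
    inputs: tuple[str, ...]    # logical names of the datasets consumed
    output: str                # logical name of the dataset produced


@dataclass
class VirtualDataCatalog:
    """Illustrative in-memory stand-in for a virtual data catalog."""
    derivations: dict[str, Derivation] = field(default_factory=dict)
    materialized: set[str] = field(default_factory=set)

    def define(self, derivation: Derivation) -> None:
        # Assumption: at most one derivation is recorded per output dataset.
        self.derivations[derivation.output] = derivation

    def provenance(self, dataset: str) -> list[Derivation]:
        """All derivations that contribute to `dataset`, upstream first."""
        lineage: list[Derivation] = []
        seen: set[str] = set()

        def visit(name: str) -> None:
            d = self.derivations.get(name)
            if d is None or name in seen:
                return
            seen.add(name)
            for parent in d.inputs:
                visit(parent)
            lineage.append(d)

        visit(dataset)
        return lineage

    def plan(self, dataset: str) -> list[Derivation]:
        """Derivations that still have to run to produce `dataset`."""
        schedule: list[Derivation] = []
        planned: set[str] = set()

        def visit(name: str) -> None:
            if name in self.materialized or name in planned:
                return  # already on disk, or already scheduled
            d = self.derivations.get(name)
            if d is None:
                raise KeyError(f"no data and no derivation for {name!r}")
            for parent in d.inputs:
                visit(parent)
            planned.add(name)
            schedule.append(d)

        visit(dataset)
        return schedule


# Usage with made-up dataset and procedure names.
catalog = VirtualDataCatalog()
catalog.materialized.add("raw_events")
catalog.define(Derivation("simulate", ("raw_events",), "sim_events"))
catalog.define(Derivation("reconstruct", ("sim_events",), "reco_events"))

schedule = catalog.plan("reco_events")
lineage = catalog.provenance("reco_events")
```

In this sketch, `plan` stops at any dataset that is already materialized, so a request only schedules the missing steps, while `provenance` ignores materialization and always reports the full recorded lineage. Handing the resulting schedule to a distributed execution service is outside the scope of the example.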