Applying Chimera virtual data concepts to cluster finding in the Sloan Sky Survey

  • Authors:
  • James Annis; Yong Zhao; Jens Voeckler; Michael Wilde; Steve Kent; Ian Foster

  • Affiliations:
  • Experimental Astrophysics, Fermilab, Batavia, IL; University of Chicago, Chicago, IL; University of Chicago, Chicago, IL; Argonne National Laboratory, Argonne, IL; Experimental Astrophysics, Fermilab, Batavia, IL; University of Chicago, Chicago, IL and Argonne National Laboratory, Argonne, IL

  • Venue:
  • Proceedings of the 2002 ACM/IEEE conference on Supercomputing
  • Year:
  • 2002

Abstract

In many scientific disciplines -- especially long-running, data-intensive collaborations -- it is important to track all aspects of data capture, production, transformation, and analysis. In principle, one can then audit, validate, reproduce, and/or re-run with corrections various data transformations. We have recently proposed and prototyped the Chimera virtual data system, a new database-driven approach to this problem. We present here a major application study in which we apply Chimera to a challenging data analysis problem: the identification of galaxy clusters within the Sloan Digital Sky Survey. We describe the problem, its computational procedures, and the use of Chimera to plan and orchestrate the workflow of thousands of tasks on a data grid comprising hundreds of computers. This experience suggests that a general set of tools can indeed enhance the accuracy and productivity of scientific data reduction and that further development and application of this paradigm will offer great value.