Driving scientific applications by data in distributed environments

  • Authors:
  • Joel Saltz;Umit Catalyurek;Tahsin Kurc;Mike Gray;Shannon Hastings;Steve Langella;Sivaramakrishnan Narayanan;Ryan Martino;Steven Bryant;Malgorzata Peszynska;Mary Wheeler;Alan Sussman;Michael Beynon;Christian Hansen;Don Stredney;Dennis Sessanna

  • Affiliations:
  • Department of Biomedical Informatics, The Ohio State University;Department of Biomedical Informatics, The Ohio State University;Department of Biomedical Informatics, The Ohio State University;Department of Biomedical Informatics, The Ohio State University;Department of Biomedical Informatics, The Ohio State University;Department of Biomedical Informatics, The Ohio State University;Department of Biomedical Informatics, The Ohio State University;Center for Subsurface Modeling, The University of Texas at Austin;Center for Subsurface Modeling, The University of Texas at Austin;Center for Subsurface Modeling, The University of Texas at Austin;Center for Subsurface Modeling, The University of Texas at Austin;Department of Computer Science, University of Maryland;Department of Computer Science, University of Maryland;Department of Computer Science, University of Maryland;Interface Laboratory, The Ohio Supercomputer Center;Interface Laboratory, The Ohio Supercomputer Center

  • Venue:
  • ICCS'03 Proceedings of the 2003 international conference on Computational science
  • Year:
  • 2003


Abstract

Traditional simulation-based applications for exploring a parameter space to understand a physical phenomenon or to optimize a design are rapidly overwhelmed by data volume when large numbers of simulations with different parameters are carried out. Optimizing reservoir management through simulation-based studies, in which large numbers of realizations are computed using detailed geologic descriptions, is an example of such an application. In this paper, we describe a software architecture to facilitate large-scale simulation studies involving ensembles of long-running simulations and analysis of vast volumes of output data. This architecture is built on top of two frameworks we have developed: IPARS and DataCutter. These frameworks make it possible to implement tools and applications that run large-scale simulations and generate and investigate terabyte-scale datasets efficiently.
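The workflow the abstract describes — run an ensemble of simulations over a parameter grid, then push the outputs through analysis filters — can be illustrated with a minimal sketch. This is a hypothetical illustration only, not the IPARS or DataCutter API; the simulation model, filter names, and parameters are all invented stand-ins for the filter-stream style of processing the paper's architecture targets.

```python
# Hypothetical sketch of an ensemble-study driver (NOT the IPARS/DataCutter
# API): simulate over a parameter grid, then stream each realization through
# a chain of analysis filters, in the spirit of a filter-stream model.
from itertools import product

def simulate(perm, porosity):
    # Stand-in for a long-running reservoir simulation: returns one
    # per-realization record (here, a trivial production estimate).
    return {"perm": perm, "porosity": porosity,
            "production": perm * porosity * 100.0}

def filter_threshold(records, min_production):
    # Filter stage: keep only realizations above a production cutoff.
    return (r for r in records if r["production"] >= min_production)

def filter_rank(records):
    # Analysis stage: rank surviving realizations by production.
    return sorted(records, key=lambda r: r["production"], reverse=True)

def run_ensemble(perms, porosities, min_production):
    # Lazily generate all parameter combinations, then apply the filter chain.
    raw = (simulate(k, phi) for k, phi in product(perms, porosities))
    return filter_rank(filter_threshold(raw, min_production))

best = run_ensemble(perms=[50, 100, 200], porosities=[0.1, 0.2],
                    min_production=2.0)
```

In a real deployment, each `simulate` call would be a distributed, long-running job and the filters would process terabyte-scale output incrementally rather than in memory; the generator-based pipeline above is only a shape-level analogy for that streaming composition.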