Improving Data Access for Computational Grid Applications

  • Authors:
  • Ron Oldfield; David Kotz

  • Affiliations:
  • Scalable Computing Systems, Sandia National Laboratories, Albuquerque 87185-1110; Department of Computer Science, Dartmouth College, 6211 Sudikoff Laboratory, Hanover 03755

  • Venue:
  • Cluster Computing
  • Year:
  • 2006

Abstract

High-performance computing increasingly occurs on "computational grids" composed of heterogeneous and geographically distributed systems of computers, networks, and storage devices that collectively act as a single "virtual" computer. A key challenge in this environment is to provide efficient access to data distributed across remote data servers. Our parallel I/O framework, called Armada, allows application and data-set providers to flexibly compose graphs of processing modules that describe the distribution, application interfaces, and processing required of the data set before computation. Although the framework provides a simple programming model for the application programmer and the data-set provider, the resulting graph may contain bottlenecks that prevent efficient data access. In this paper, we present an algorithm that restructures Armada graphs to distribute computation and data flow, improving performance in the context of a wide-area computational grid.