Pipeline and Batch Sharing in Grid Workloads

  • Authors: Douglas Thain; John Bent; Andrea C. Arpaci-Dusseau; Remzi H. Arpaci-Dusseau; Miron Livny

  • Venue: HPDC '03: Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing
  • Year: 2003

Abstract

We present a study of six batch-pipelined scientific workloads that are candidates for execution on computational grids. Whereas other studies focus on the behavior of single applications, this study characterizes workloads composed of pipelines of sequential processes that use file storage for communication and also share significant data across a batch. This study includes measurements of the memory, CPU, and I/O requirements of individual components as well as analyses of I/O sharing within complete batches. We conclude with a discussion of the ramifications of these workloads for end-to-end scalability and overall system design.