MTC envelope: defining the capability of large scale computers in the context of parallel scripting applications

  • Authors:
  • Zhao Zhang; Daniel S. Katz; Michael Wilde; Justin M. Wozniak; Ian Foster

  • Affiliations:
  • University of Chicago, Chicago, IL, USA; University of Chicago & Argonne National Laboratory, Chicago, IL, USA; University of Chicago & Argonne National Laboratory, Chicago, IL, USA; Argonne National Laboratory, Argonne, IL, USA; University of Chicago, Chicago, IL, USA

  • Venue:
  • Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing
  • Year:
  • 2013

Abstract

Many scientific applications can be efficiently expressed with the parallel scripting (many-task computing, MTC) paradigm. These applications are typically composed of several stages of computation, with tasks in different stages coupled by a shared file system abstraction. However, we often see poor performance when running these applications on large-scale computers, due to the frequency and volume of the applications' file system I/O and the absence of appropriate optimizations in the context of parallel scripting applications. In this paper, we show the capability of existing large-scale computers to run parallel scripting applications by first defining the MTC envelope and then evaluating the envelope by benchmarking a suite of shared file system performance metrics. We also seek to determine the origin of the performance bottleneck by profiling the parallel scripting applications' I/O behavior and mapping the I/O operations to the MTC envelope. We show an example shared file system envelope and present a method to predict I/O performance given an application's level of I/O concurrency and I/O volume. This work is instrumental in guiding the development of parallel scripting applications to make efficient use of existing large-scale computers, and in evaluating performance improvements in the hardware/software stack that will better facilitate parallel scripting applications.