Just in time: adding value to the IO pipelines of high performance applications with JITStaging

  • Authors:
  • Hasan Abbasi;Greg Eisenhauer;Matthew Wolf;Karsten Schwan;Scott Klasky

  • Affiliations:
  • Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Oak Ridge National Laboratory, Oak Ridge, TN, USA

  • Venue:
  • Proceedings of the 20th international symposium on High performance distributed computing
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Large scale applications are generating a tsunami of data, with understanding driven by finding information hidden within this data. The ever-increasing sizes of output, however, are making it difficult for science users to inspect the data generated by their applications, understand its important properties, and/or organize it for subsequent analysis and visualization. This paper presents JITStager, a software infrastructure with which end users can dynamically customize and thus, add value to the output pipelines of their HEC applications. JITStager is able to customize data at scale, by leveraging the computational power of both compute nodes and of additional `data staging' nodes allocated by end users. Using existing, componentized I/O interfaces to decouple the compile-time specification of the program and the run-time customization of the data pipeline, JITStager employs efficient runtime methods for binary code generation and data movement to create custom pipelines for applications' output processes that provide end users with improved insights into the data being produced, without burdening the application's computational performance and without impeding output performance. This paper describes the JITStager architecture, evaluates its performance, and demonstrates the advantages derived from its use with representative HPC applications.