Combining in-situ and in-transit processing to enable extreme-scale scientific analysis

  • Authors:
  • Janine C. Bennett; Hasan Abbasi; Peer-Timo Bremer; Ray Grout; Attila Gyulassy; Tong Jin; Scott Klasky; Hemanth Kolla; Manish Parashar; Valerio Pascucci; Philippe Pebay; David Thompson; Hongfeng Yu; Fan Zhang; Jacqueline Chen

  • Affiliations:
  • Sandia National Laboratories; Oak Ridge National Laboratory; Lawrence Livermore National Laboratory; National Renewable Energy Laboratory; University of Utah; Rutgers University; Oak Ridge National Laboratory; Sandia National Laboratories; Rutgers University; University of Utah; Kitware; Kitware; Sandia National Laboratories; Rutgers University; Sandia National Laboratories

  • Venue:
  • SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
  • Year:
  • 2012

Abstract

With the onset of extreme-scale computing, I/O constraints make it increasingly difficult for scientists to save a sufficient amount of raw simulation data to persistent storage. One potential solution is to change the data analysis pipeline from a post-process-centric to a concurrent approach based on either in-situ or in-transit processing. In this context, computations are considered in-situ if they utilize the primary compute resources, while in-transit processing refers to offloading computations to a set of secondary resources using asynchronous data transfers. In this paper we explore the design and implementation of three common analysis techniques typically performed on large-scale scientific simulations: topological analysis, descriptive statistics, and visualization. We summarize algorithmic developments, describe a resource scheduling system to coordinate the execution of various analysis workflows, and discuss our implementation using the DataSpaces and ADIOS frameworks that support efficient data movement between in-situ and in-transit computations. We demonstrate the efficiency of our lightweight, flexible framework by deploying it on the Jaguar XK6 to analyze data generated by S3D, a massively parallel turbulent combustion code. Our framework allows scientists dealing with the data deluge at extreme scale to perform analyses at increased temporal resolutions, mitigate I/O costs, and significantly improve the time to insight.
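
As a rough illustration of the in-situ/in-transit split described in the abstract (and not the paper's actual ADIOS/DataSpaces implementation), the following hypothetical Python sketch runs a cheap reduction inline with a mock simulation loop while offloading a heavier analysis to a separate staging process over an asynchronous, bounded queue. All function names, parameters, and data here are invented for illustration only.

    import multiprocessing as mp

    import numpy as np


    def in_transit_analysis(queue):
        """Stand-in for analysis running on secondary (staging) resources."""
        while True:
            item = queue.get()
            if item is None:  # sentinel: the simulation has finished
                break
            step, field = item
            # Placeholder for a heavier analysis (e.g., topological segmentation).
            hist, _ = np.histogram(field, bins=16)
            print(f"[in-transit] step {step}: histogram max bin count = {hist.max()}")


    def run_simulation(num_steps=5, field_size=100_000):
        queue = mp.Queue(maxsize=4)  # bounded buffer, a crude stand-in for staging memory
        staging = mp.Process(target=in_transit_analysis, args=(queue,))
        staging.start()

        rng = np.random.default_rng(0)
        for step in range(num_steps):
            field = rng.normal(size=field_size)  # stand-in for one simulation time step
            # In-situ: a cheap reduction computed on the primary resource itself.
            print(f"[in-situ]    step {step}: mean={field.mean():.4f}, std={field.std():.4f}")
            # In-transit: hand the raw field off asynchronously for heavier work.
            queue.put((step, field))

        queue.put(None)  # signal completion to the staging process
        staging.join()


    if __name__ == "__main__":
        run_simulation()

In this toy setup, the bounded queue plays the role of back-pressure that a real staging area would impose when the in-transit resources fall behind the simulation; the real framework instead relies on asynchronous data transfers and a scheduler to coordinate the analysis workflows.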