Integrated in-system storage architecture for high performance computing

  • Authors:
  • Dries Kimpe; Kathryn Mohror; Adam Moody; Brian Van Essen; Maya Gokhale; Rob Ross; Bronis R. de Supinski

  • Affiliations:
  • Argonne National Laboratory, Argonne, IL; Lawrence Livermore National Laboratory, Livermore, CA; Lawrence Livermore National Laboratory, Livermore, CA; Lawrence Livermore National Laboratory, Livermore, CA; Lawrence Livermore National Laboratory, Livermore, CA; Argonne National Laboratory, Argonne, IL; Lawrence Livermore National Laboratory, Livermore, CA

  • Venue:
  • Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers
  • Year:
  • 2012

Abstract

In-system solid state storage is expected to be an important component of the I/O subsystem on the first exascale platforms, as it has the potential to reduce DRAM requirements, to increase system reliability, and to smooth I/O loads. This paper describes the design of a prototype, integrated in-system storage architecture that we are developing to serve the diverse needs of high performance computing. Our container abstraction will provide lightweight management of in-system storage devices, as well as methods to access containers remotely and to transfer them within the storage hierarchy. We are also working on a storage hierarchy abstraction API to provide portable HPC I/O software with the critical information on the configuration of the system on which it is running. As currently available large-scale HPC systems lack in-system storage, we are developing a solid state storage simulator backed by DRAM. We are integrating these efforts around an I/O-intensive workload provided by the scalable checkpoint/restart (SCR) library. We expect our efforts to reduce the overheads of checkpointing and data movement across the system and thus to improve the scalability and reliability of HPC applications.