Massively parallel loading

  • Authors:
  • Wolfgang Frings; Dong H. Ahn; Matthew LeGendre; Todd Gamblin; Bronis R. de Supinski; Felix Wolf

  • Affiliations:
  • Forschungszentrum Juelich, Juelich, Germany; Lawrence Livermore National Laboratory, Livermore, CA, USA; Lawrence Livermore National Laboratory, Livermore, CA, USA; Lawrence Livermore National Laboratory, Livermore, CA, USA; Lawrence Livermore National Laboratory, Livermore, CA, USA; German Research School for Simulation Sciences / RWTH Aachen University, Aachen, Germany

  • Venue:
  • Proceedings of the 27th ACM International Conference on Supercomputing
  • Year:
  • 2013

Abstract

Dynamic linking has many advantages for managing large code bases, but dynamically linked applications have not typically scaled well on high-performance computing systems. Splitting a monolithic executable into many dynamic shared object (DSO) files decreases compile time for large codes, reduces runtime memory requirements by allowing modules to be loaded and unloaded as needed, and allows common DSOs to be shared among many executables. However, launching an executable that depends on many DSOs causes a flood of file system operations at program start-up, when each process in the parallel application loads its dependencies. At large scales, this flood has an effect similar to a site-wide denial-of-service attack, as even large parallel file systems struggle to service so many simultaneous requests. In this paper, we present SPINDLE, a novel approach to parallel loading that coordinates simultaneous file system operations with a scalable network of cache server processes. Our approach is transparent to user applications. We extend the GNU loader, which is used in Linux as well as in proprietary operating systems, to limit the number of simultaneous file system operations, quickly loading DSOs without thrashing the file system. Our experiments show that our prototype implementation has low overhead and increases the scalability of Pynamic, a benchmark that stresses the dynamic loader, by a factor of 20.
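
The difference between uncoordinated loading and loading through a network of caches can be illustrated with a toy simulation. The sketch below is not SPINDLE's protocol or code; the abstract does not specify a topology, so the tree layout, the fan-out, and every name here (`CountingFS`, `CacheNode`, `build_tree`, `simulate`) are assumptions made for illustration only. It counts how many reads reach the shared file system when every process fetches each DSO directly, versus when each process asks a parent cache and only the root touches the file system.

```python
class CountingFS:
    """Stand-in for a shared file system that counts read requests."""

    def __init__(self, files):
        self.files = files
        self.reads = 0

    def read(self, name):
        self.reads += 1
        return self.files[name]


class CacheNode:
    """Hypothetical cache server: asks its parent on a miss, the root asks the FS."""

    def __init__(self, rank, parent=None):
        self.rank = rank
        self.parent = parent
        self.cache = {}

    def fetch(self, name, fs):
        if name in self.cache:
            return self.cache[name]
        if self.parent is None:
            data = fs.read(name)                 # only the root reads from the FS
        else:
            data = self.parent.fetch(name, fs)   # forward the request up the tree
        self.cache[name] = data
        return data


def build_tree(num_nodes, fanout):
    """Arrange nodes in a k-ary tree; the parent of rank r is (r - 1) // fanout."""
    nodes = [CacheNode(0)]
    for rank in range(1, num_nodes):
        nodes.append(CacheNode(rank, parent=nodes[(rank - 1) // fanout]))
    return nodes


def naive_fs_reads(num_procs, num_dsos):
    """Every process reads every DSO directly from the file system."""
    return num_procs * num_dsos


def simulate(num_procs, num_dsos, fanout=2):
    """Return the number of FS reads when all loads go through the cache tree."""
    files = {f"lib{i}.so": b"payload" for i in range(num_dsos)}
    fs = CountingFS(files)
    for node in build_tree(num_procs, fanout):
        for name in files:
            node.fetch(name, fs)
    return fs.reads


if __name__ == "__main__":
    print("naive reads: ", naive_fs_reads(1024, 50))
    print("cached reads:", simulate(1024, 50))
```

In this model the number of file system reads depends only on the number of DSOs, not on the number of processes, which is the property the abstract attributes to coordinated loading. Real costs such as network forwarding latency and metadata operations are not modeled.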