SST + gem5 = a scalable simulation infrastructure for high performance computing

  • Authors:
  • Mingyu Hsieh; Kevin Pedretti; Jie Meng; Ayse Coskun; Michael Levenhagen; Arun Rodrigues

  • Affiliations:
  • Sandia National Labs, Albuquerque, NM; Sandia National Labs, Albuquerque, NM; Boston University, Boston, MA; Boston University, Boston, MA; Sandia National Labs, Albuquerque, NM; Sandia National Labs, Albuquerque, NM

  • Venue:
  • Proceedings of the 5th International ICST Conference on Simulation Tools and Techniques
  • Year:
  • 2012

Abstract

High Performance Computing (HPC) faces new challenges in scalability, performance, reliability, and power consumption. Solving these challenges will require radically new hardware and software approaches. It is impractical to explore this vast design space without detailed system-level simulations. However, most existing simulators are insufficiently detailed, not scalable, or unable to evaluate key system characteristics such as energy consumption or reliability. To address this problem, we integrate the highly detailed gem5 performance simulator into the parallel Structural Simulation Toolkit (SST). We add a fast-forwarding capability to SST/gem5 and port the lightweight Kitten operating system to gem5. In addition, we improve the reliability model in SST with a comprehensive analysis of system reliability. Using this simulation framework, we evaluate the impact of two energy-efficient, resource-conscious scheduling policies on system reliability. Our results show that the effectiveness of a scheduling policy differs according to the composition of the workload and the system topology.