Memory Trace Compression and Replay for SPMD Systems using Extended PRSDs

  • Authors:
  • Sandeep Budanur; Frank Mueller; Todd Gamblin

  • Affiliations:
  • North Carolina State University, Raleigh, NC, USA; North Carolina State University, Raleigh, NC, USA; Lawrence Livermore National Laboratory, Livermore, CA, USA

  • Venue:
  • ACM SIGMETRICS Performance Evaluation Review - Special issue on the 1st international workshop on performance modeling, benchmarking and simulation of high performance computing systems (PMBS 10)
  • Year:
  • 2011

Abstract

Concurrency levels in large-scale supercomputers are rising exponentially, and shared-memory nodes with hundreds of cores and non-uniform memory access latencies are expected within the next decade. However, even current petascale systems with tens of cores per node suffer from memory bottlenecks. As core counts increase, memory issues become critical for the performance of large-scale supercomputers. Trace analysis tools are vital for diagnosing the root causes of memory problems. However, existing tools are expensive due to prohibitively large trace sizes, or they collect only statistical summaries that omit valuable information. In this paper, we present ScalaMemTrace, a novel technique for collecting memory traces in a scalable manner. ScalaMemTrace builds on prior trace methods with aggressive compression techniques to allow lossless representation of memory traces for dense algebraic kernels, with near-constant trace size irrespective of the problem size or the number of threads. We further introduce a replay mechanism for ScalaMemTrace traces, and discuss the results of our prototype implementation on the x86_64 architecture.
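To illustrate why a regular access pattern can yield a near-constant trace size, the following is a minimal sketch, not the ScalaMemTrace implementation. It assumes a simplified descriptor of the form (start, stride, count) for a run of constant-stride addresses; the names `compress` and `replay` and the descriptor layout are hypothetical and chosen only for illustration. The nesting and extensions that the title refers to as extended PRSDs are not modeled here.

```python
from typing import List, Tuple

# Hypothetical descriptor layout (an assumption, not the paper's format):
# (start address, stride in bytes, number of accesses)
Descriptor = Tuple[int, int, int]


def compress(addresses: List[int]) -> List[Descriptor]:
    """Greedily fold runs of constant-stride addresses into descriptors."""
    descriptors: List[Descriptor] = []
    i, n = 0, len(addresses)
    while i < n:
        start = addresses[i]
        if i + 1 == n:
            # A single trailing access has no stride; record it alone.
            descriptors.append((start, 0, 1))
            break
        stride = addresses[i + 1] - start
        count = 2
        # Extend the run while consecutive differences match the stride.
        while i + count < n and addresses[i + count] - addresses[i + count - 1] == stride:
            count += 1
        descriptors.append((start, stride, count))
        i += count
    return descriptors


def replay(descriptors: List[Descriptor]) -> List[int]:
    """Expand descriptors back into the original address sequence."""
    trace: List[int] = []
    for start, stride, count in descriptors:
        trace.extend(start + k * stride for k in range(count))
    return trace


if __name__ == "__main__":
    # A unit-stride sweep over 8-byte elements, as in a dense vector kernel.
    for length in (10, 1000, 100000):
        trace = [0x1000 + 8 * k for k in range(length)]
        packed = compress(trace)
        assert replay(packed) == trace  # lossless round trip
        print(length, len(packed))
```

In this sketch the number of descriptors for a single strided sweep stays at one regardless of how many accesses the sweep contains, and replaying the descriptors reproduces the address sequence exactly. Irregular access patterns would compress far less well under this simple greedy scheme.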