//TRACE: parallel trace replay with approximate causal events

  • Authors: Michael P. Mesnier; Matthew Wachs; Raja R. Sambasivan; Julio Lopez; James Hendricks; Gregory R. Ganger; David O'Hallaron

  • Affiliations: Intel Research with Carnegie Mellon University; Carnegie Mellon University; Carnegie Mellon University; Carnegie Mellon University; Carnegie Mellon University; Carnegie Mellon University; Carnegie Mellon University

  • Venue: FAST '07 Proceedings of the 5th USENIX conference on File and Storage Technologies
  • Year: 2007

Abstract

//TRACE is a new approach for extracting and replaying traces of parallel applications to recreate their I/O behavior. Its tracing engine automatically discovers internode data dependencies and inter-I/O compute times for each node (process) in an application. This information is reflected in per-node annotated I/O traces. Such annotation allows a parallel replayer to closely mimic the behavior of a traced application across a variety of storage systems. When compared to other replay mechanisms, //TRACE offers significant gains in replay accuracy. Overall, the average replay error for the parallel applications evaluated in this paper is below 6%.
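
To make the replay idea in the abstract concrete, the following is a minimal sketch of how a parallel replayer might consume per-node annotated traces. It is not the //TRACE implementation or its trace format: the `TraceRecord` fields, the `replay` function, and the `issue_io` callback are all hypothetical names chosen for illustration, and the choice to wait on dependencies before sleeping for the compute time is an assumption.

```python
import threading
import time
from dataclasses import dataclass, field


@dataclass
class TraceRecord:
    """One annotated I/O in a node's trace (hypothetical layout)."""
    event_id: str          # unique label for this I/O
    compute_time: float    # seconds of computation that preceded the I/O
    op: str                # e.g. "read" or "write"
    depends_on: list = field(default_factory=list)  # event_ids on other nodes


def replay(traces, issue_io):
    """Replay per-node traces with one thread per node.

    `traces` maps a node name to its ordered list of TraceRecord objects.
    `issue_io(node, record)` is a caller-supplied function that performs
    (or simulates) the I/O against the storage system under test.
    """
    # One completion flag per I/O, so other nodes can wait on it.
    done = {rec.event_id: threading.Event()
            for records in traces.values() for rec in records}

    def run_node(node, records):
        for rec in records:
            for dep in rec.depends_on:
                done[dep].wait()           # honor the inter-node dependency
            time.sleep(rec.compute_time)   # mimic the inter-I/O compute time
            issue_io(node, rec)
            done[rec.event_id].set()       # unblock any dependent nodes

    threads = [threading.Thread(target=run_node, args=(node, records))
               for node, records in traces.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


# Usage: node1's read must not start until node0's write has completed.
traces = {
    "node0": [TraceRecord("n0-w1", 0.01, "write")],
    "node1": [TraceRecord("n1-r1", 0.0, "read", depends_on=["n0-w1"])],
}
log = []
replay(traces, lambda node, rec: log.append((node, rec.event_id)))
```

Because each node runs in its own thread and blocks only on the events it depends on, independent I/Os can proceed in parallel while dependent ones keep their traced ordering, whatever the speed of the storage system behind `issue_io`.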