Efficient Tracing for On-the-Fly Space-Time Displays in a Debugger for Message Passing Programs

  • Authors:
  • Robert Hood; Gregory Matthews

  • Venue:
  • CCGRID '01 Proceedings of the 1st International Symposium on Cluster Computing and the Grid
  • Year:
  • 2001

Abstract

In this work we describe the implementation of a practical mechanism for collecting and displaying trace information in a debugger for message passing programs. We introduce a trace format that is highly compressible while still providing information adequate for debugging purposes. We make the mechanism convenient for users by incorporating the trace collection in a set of wrappers for the MPI communication library. We implement several debugger operations that use the trace display: consistent stoplines, undo, and rollback. All three are implemented using controlled replay, which runs the target processes at full speed until the appropriate position in the computation is reached. These operations provide convenient ways to reach points in the execution where the full power of a state-based debugger can be brought to bear on isolating communication errors.
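
The abstract does not spell out the trace format or the replay mechanism, so the following is only an illustrative sketch of the general idea, not the authors' design. It assumes run-length encoding as a stand-in for the "highly compressible" format, a hypothetical `traced` wrapper in place of the MPI wrappers, and a plain event counter as the "position in the computation" at which replay stops. All names (`Trace`, `traced`, `replay_until`) are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Per-process trace stored as [event, repeat_count] runs (assumed format)."""
    runs: list = field(default_factory=list)

    def record(self, op, peer, tag):
        # Identical consecutive events collapse into one run, so a loop that
        # sends the same message many times costs a single trace entry.
        event = (op, peer, tag)
        if self.runs and self.runs[-1][0] == event:
            self.runs[-1][1] += 1
        else:
            self.runs.append([event, 1])

    def __len__(self):
        # Total number of logged events, counting repeats.
        return sum(count for _, count in self.runs)


def traced(trace, op, fn):
    """Wrap a communication routine so every call is logged before it runs."""
    def wrapper(peer, tag, *args):
        trace.record(op, peer, tag)
        return fn(peer, tag, *args)
    return wrapper


def replay_until(trace, target):
    """Yield logged events in order, stopping once `target` events are replayed."""
    done = 0
    for event, count in trace.runs:
        for _ in range(count):
            if done == target:
                return
            yield event
            done += 1


# Hypothetical usage: a stand-in "send" that does nothing but get traced.
trace = Trace()
send = traced(trace, "send", lambda peer, tag, *payload: None)
for _ in range(1000):
    send(1, 0)
send(2, 7)
```

In this sketch the 1001 logged calls occupy two trace entries, and `replay_until(trace, 3)` stops after the third event. A real debugger would re-execute the target processes up to that point rather than iterate over a list.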