Debugging distributed programs using controlled re-execution

  • Authors:
  • Neeraj Mittal; Vijay K. Garg

  • Affiliations:
  • Department of Computer Sciences, The University of Texas at Austin, Austin, TX; Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX

  • Venue:
  • Proceedings of the Nineteenth Annual ACM Symposium on Principles of Distributed Computing
  • Year:
  • 2000

Abstract

Distributed programs are hard to write. A distributed debugger equipped with a mechanism to re-execute the traced computation in a controlled fashion can greatly facilitate the detection and localization of bugs. This approach gives rise to a general problem, called the predicate control problem, which takes a computation and a safety property specified on the computation, and outputs a controlled computation that maintains the property. We define a class of global predicates, called region predicates, that can be controlled efficiently in a distributed computation. We prove that the synchronization generated by our algorithm is optimal. Further, we introduce the notion of an admissible sequence of events and prove that it is equivalent to the notion of predicate control. We then give an efficient algorithm for the class of disjunctive predicates based on the notion of an admissible sequence.
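To make the problem statement concrete, here is a minimal brute-force sketch of predicate control on a toy model. It is not the paper's algorithm for region or disjunctive predicates, and it makes no claim of optimal synchronization. The model is an assumption made for illustration: a computation is given as a number of events per process plus happened-before pairs, a global state is a cut (how many events each process has executed), and control is simplified to finding one order of single-event steps along which every cut satisfies the safety property. The names `is_consistent`, `controlled_order`, `lengths`, `deps`, and `safe` are hypothetical.

```python
from collections import deque


def is_consistent(cut, deps):
    """Return True if the cut respects every happened-before pair.

    cut[p] is the number of events process p has executed. Each pair in
    deps is ((q, j), (p, i)): event j of process q must precede event i
    of process p (event indices are 1-based).
    """
    for (q, j), (p, i) in deps:
        if cut[p] >= i and cut[q] < j:
            return False
    return True


def controlled_order(lengths, deps, safe):
    """Breadth-first search over cuts for a safe execution order.

    Returns a list of (process, event_index) steps leading from the
    initial cut to the final cut such that every visited cut is
    consistent and satisfies safe(cut), or None if no such order exists.
    """
    start = tuple(0 for _ in lengths)
    goal = tuple(lengths)
    if not safe(start):
        return None
    parent = {start: None}  # cut -> (previous cut, event taken) or None
    queue = deque([start])
    while queue:
        cut = queue.popleft()
        if cut == goal:
            order = []
            while parent[cut] is not None:
                cut, event = parent[cut]
                order.append(event)
            return order[::-1]
        for p in range(len(lengths)):
            if cut[p] == lengths[p]:
                continue  # process p has no events left
            nxt = cut[:p] + (cut[p] + 1,) + cut[p + 1:]
            if nxt in parent:
                continue
            if is_consistent(nxt, deps) and safe(nxt):
                parent[nxt] = (cut, (p, nxt[p]))
                queue.append(nxt)
    return None


# Toy trace: two processes, each with an "enter" event (1) and an
# "exit" event (2). A process is in its critical section when it has
# executed exactly one event. Safety property: mutual exclusion.
def mutual_exclusion(cut):
    return not (cut[0] == 1 and cut[1] == 1)


order = controlled_order([2, 2], [], mutual_exclusion)
print(order)  # [(0, 1), (0, 2), (1, 1), (1, 2)]
```

Replaying the trace in the returned order is one way to maintain the property in this toy model, although a total order generally adds far more synchronization than necessary. The search visits cuts explicitly, so its cost grows exponentially with the number of processes, which is the kind of blow-up that motivates restricting attention to predicate classes that admit efficient control.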