Life, death, and the critical transition: finding liveness bugs in systems code

  • Authors:
  • Charles Killian; James W. Anderson; Ranjit Jhala; Amin Vahdat

  • Affiliations:
  • University of California, San Diego; University of California, San Diego; University of California, San Diego; University of California, San Diego

  • Venue:
  • NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
  • Year:
  • 2007

Abstract

Modern software model checkers find safety violations: breaches where the system enters some bad state. However, we argue that checking liveness properties offers both a richer and more natural way to search for errors, particularly in complex concurrent and distributed systems. Liveness properties specify desirable system behaviors that must be satisfied eventually but need not hold at every moment, for example during system initialization or after a failure. Existing software model checkers cannot verify liveness because doing so requires finding an infinite execution that never satisfies the liveness property. We present heuristics to find a large class of liveness violations and to identify the critical transition of the violating execution. The critical transition is the step in an execution that moves the system from a state that does not currently satisfy some liveness property, but from which recovery is still possible, to a dead state that can never achieve the liveness property. Our software model checker, MaceMC, isolates complex liveness errors in our implementations of Pastry, Chord, a reliable transport protocol, and an overlay tree.
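
To make the notion of a critical transition concrete, the following is a minimal Python sketch on a toy, finite transition system. It is an illustration under stated assumptions, not the paper's checker or its heuristics. The abstract does not describe those heuristics, so this sketch decides whether a state is dead by exhaustive reachability, which is only feasible because the example graph is tiny. The state names, `TRANSITIONS`, `is_live`, `can_reach_live`, and `critical_transition` are all hypothetical names introduced here.

```python
from collections import deque

# Hypothetical toy transition system: state -> list of successor states.
TRANSITIONS = {
    "init": ["joining"],
    "joining": ["joined", "partitioned"],
    "partitioned": ["partitioned"],  # self-loop only: the node can never join
    "joined": ["joined"],
}


def is_live(state):
    """Toy liveness property: the node has joined the overlay."""
    return state == "joined"


def can_reach_live(start):
    """Return True if some state satisfying is_live is reachable from start.

    Exhaustive breadth-first search; an assumption that only works for a
    small finite graph like this one.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if is_live(state):
            return True
        for nxt in TRANSITIONS[state]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def critical_transition(path):
    """Return i such that path[i] -> path[i + 1] is the critical transition.

    Returns None if the path does not go from a recoverable state to a dead
    one. Binary search is valid because every successor of a dead state is
    also dead, so deadness is monotonic along the path.
    """
    lo, hi = 0, len(path) - 1
    if not can_reach_live(path[lo]) or can_reach_live(path[hi]):
        return None
    # Invariant: path[lo] is recoverable, path[hi] is dead.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if can_reach_live(path[mid]):
            lo = mid
        else:
            hi = mid
    return lo


execution = ["init", "joining", "partitioned", "partitioned"]
step = critical_transition(execution)
print(execution[step], "->", execution[step + 1])
```

In this toy execution the step from `joining` to `partitioned` is the critical transition: before it the liveness property is unsatisfied but still achievable, and after it no continuation can ever satisfy it. For real systems code the state space is far too large for exhaustive search, which is why a heuristic test for dead states is needed in place of `can_reach_live`.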