Large Scale Verification of MPI Programs Using Lamport Clocks with Lazy Update

  • Authors:
  • Anh Vo;Ganesh Gopalakrishnan;Robert M. Kirby;Bronis R. de Supinski;Martin Schulz;Greg Bronevetsky

  • Affiliations:
  • -;-;-;-;-;-

  • Venue:
  • PACT '11 Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques
  • Year:
  • 2011

Quantified Score

Hi-index 0.02

Visualization

Abstract

We propose a dynamic verification approach for large-scale message passing programs to locate correctness bugs caused by unforeseen nondeterministic interactions. This approach hinges on an efficient protocol to track the causality between nondeterministic message receive operations and potentially matching send operations. We show that causality tracking protocols that rely solely on logical clocks fail to capture all nuances of MPI program behavior, including the variety of ways in which nonblocking calls can complete. Our approach is hinged on formally defining the matches-before relation underlying the MPI standard, and devising lazy update logical clock based algorithms that can correctly discover all potential outcomes of nondeterministic receives in practice. can achieve the same coverage as a vector clock based algorithm while maintaining good scalability. LLCP allows us to analyze realistic MPI programs involving a thousand MPI processes, incurring only modest overheads in terms of communication bandwidth, latency, and memory consumption.