Efficient dynamic verification algorithms for MPI applications

  • Authors:
  • Ganesh Gopalakrishnan;Robert M. Kirby;Sarvani Vakkalanka

  • Affiliations:
  • The University of Utah;The University of Utah;The University of Utah

  • Venue:
  • Efficient dynamic verification algorithms for MPI applications
  • Year:
  • 2010

Abstract

The Message Passing Interface (MPI) Application Programming Interface (API) is used in almost all high performance computing applications. Yet conventional debugging tools for MPI suffer from two serious drawbacks: they cannot prevent the exponentially growing number of redundant schedules from being explored, and they cannot prevent the processes from being locked into a small subset of schedules, so that potentially buggy schedules are often reached only when programs are ported to new platforms.

Dynamic verification methods are the natural choice for debugging real-world MPI programs when model extraction and maintenance are expensive. While many dynamic verification tools exist for shared memory programs, there are no corresponding tools that support MPI, the lingua franca of parallel programming.

While interleaving reduction suggests the use of dynamic partial order reduction (DPOR), four aspects of MPI make previous DPOR algorithms inapplicable: (i) MPI contains asynchronous calls that can complete out of program order; (ii) MPI has global synchronization operations that have weak semantics; (iii) the runtime of MPI cannot, without intrusive modifications, be forced to pursue a specific interleaving with nondeterministic wildcard receives; and (iv) the progress of MPI operations can depend on platform-dependent runtime buffering, so that bugs sometimes appear when resources are added to boost performance. This dissertation provides a formal model for MPI and introduces a tailor-made notion of Happens-Before ordering for MPI functions. The crucial feature of this Happens-Before relation is that it solves all four of these problems. MPI dynamic analysis is turned into a prioritized scheduling algorithm respecting MPI’s Happens-Before.

This dissertation contributes three algorithms that have been demonstrated in the context of a practical MPI dynamic verification tool called In-Situ Partial Order (ISP). The Partial Order avoiding Elusive Interleavings (POE) algorithm is a simple prioritized execution of the MPI transitions and is guaranteed to find all deadlocks, assertion violations and resource leaks under zero buffering. The POEOPT algorithm avoids many of the redundant interleavings of POE by fully exploiting MPI’s Happens-Before. Finally, the POEMSE algorithm discovers all minimal runtime bufferings that are guaranteed to expose bugs. POEMSE’s slack analysis has minimal overhead and supports verification for safe portability by considering all relevant bufferings that might exist on various platforms. In effect, a program is dynamically verified not just with respect to the platform on which the tool is run, but also with respect to all platforms.
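
The effect of nondeterministic wildcard receives described in the abstract can be illustrated with a toy model. The following is a minimal Python sketch, not the POE algorithm and not part of ISP: it assumes a hypothetical three-process program in which ranks 0 and 1 each send one message to rank 2, and rank 2 posts two wildcard receives and asserts that the first message came from rank 0. The function names (`receiver_assertion_holds`, `explore`, `failing_schedules`) are invented for illustration, and the brute-force enumeration over match orders stands in for the far more selective exploration the dissertation describes.

```python
from itertools import permutations


def receiver_assertion_holds(match_order):
    """Replay the hypothetical rank 2 for one wildcard match order.

    match_order is a tuple of sender ranks in the order in which the two
    wildcard receives are matched. Rank 2 assumes that the first message
    comes from rank 0, so the assertion holds only in that case.
    """
    first_sender = match_order[0]
    return first_sender == 0


def explore(senders=(0, 1)):
    """Enumerate every match order and record whether the assertion holds."""
    return {
        order: receiver_assertion_holds(order)
        for order in permutations(senders)
    }


def failing_schedules(senders=(0, 1)):
    """Return the match orders under which rank 2's assertion fails."""
    return [order for order, ok in explore(senders).items() if not ok]


if __name__ == "__main__":
    for order, ok in explore().items():
        status = "assertion holds" if ok else "assertion VIOLATED"
        print(f"match order {order}: {status}")
```

A native run on one platform may only ever produce one of the two match orders, which is why a verifier has to force the alternative match to expose the violation.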