Improving distributed memory applications testing by message perturbation

  • Authors:
  • Richard Vuduc; Martin Schulz; Dan Quinlan; Bronis de Supinski; Andreas Sæbjørnsen

  • Affiliations:
  • Lawrence Livermore National Laboratory, Livermore, CA; Lawrence Livermore National Laboratory, Livermore, CA; Lawrence Livermore National Laboratory, Livermore, CA; Lawrence Livermore National Laboratory, Livermore, CA; University of Oslo, Oslo, Norway

  • Venue:
  • Proceedings of the 2006 workshop on Parallel and distributed systems: testing and debugging
  • Year:
  • 2006

Abstract

We present initial work on perturbation techniques that cause the manifestation of timing-related bugs in distributed memory Message Passing Interface (MPI)-based applications. These techniques improve the coverage of possible message orderings in MPI applications that rely on nondeterministic point-to-point communication, and they work with small processor counts, alleviating the need to test at larger scales. Using carefully designed model problems, we show that these techniques aid testing for problems that are often not easily reproduced when running on small fractions of the machine. Our perturbation layer, JITTERBUG, builds on PnMPI, an extension of the MPI profiling interface that supports multiple layers of profiling libraries. We discuss how JITTERBUG complements existing MPI checking tools through the PnMPI framework. We present opportunities to build additional tools that statically analyze and directly transform the source code to support testing and debugging MPI applications at reduced scale.
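
The idea that message perturbation widens the coverage of message orderings can be illustrated with a small simulation. The sketch below is not JITTERBUG's implementation and does not use MPI; the function name `observed_orderings`, the nominal arrival times, and the uniform delay model are all assumptions made for illustration. It models one receiver that accepts messages from any sender in arrival order and counts how many distinct arrival orders appear with and without injected delay.

```python
import random


def observed_orderings(num_senders, trials, max_jitter, seed=0):
    """Return the set of distinct arrival orders seen by a wildcard receiver.

    Hypothetical model: sender i's message nominally arrives at time i.
    A perturbation layer adds a random delay in [0, max_jitter] to each send.
    """
    rng = random.Random(seed)
    seen = set()
    for _ in range(trials):
        arrivals = []
        for sender in range(num_senders):
            # Injected delay; zero when perturbation is switched off.
            jitter = rng.uniform(0.0, max_jitter) if max_jitter > 0 else 0.0
            arrivals.append((sender + jitter, sender))
        # The receiver matches messages in order of arrival time.
        arrivals.sort()
        seen.add(tuple(sender for _, sender in arrivals))
    return seen


# Without perturbation the timing is fixed, so only one order is exercised.
baseline = observed_orderings(num_senders=3, trials=1000, max_jitter=0.0)
# With perturbation, repeated runs can reach additional orders.
perturbed = observed_orderings(num_senders=3, trials=1000, max_jitter=5.0)
print(len(baseline), "ordering(s) without perturbation")
print(len(perturbed), "ordering(s) with perturbation")
```

In this toy model, a bug that only appears when, say, sender 2's message arrives before sender 0's would never be exercised by the unperturbed runs, whereas the perturbed runs have a chance of triggering it even with only three senders.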