FlowChecker: Detecting Bugs in MPI Libraries via Message Flow Checking

  • Authors:
  • Zhezhe Chen;Qi Gao;Wenbin Zhang;Feng Qin

  • Venue:
  • Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
  • Year:
  • 2010

Abstract

Many MPI libraries have suffered from software bugs, which severely impact the productivity of a large number of users. This paper presents a new method called FlowChecker for detecting communication-related bugs in MPI libraries. The main idea is to extract program intentions of message passing (MP-intentions), and to check whether these MP-intentions are fulfilled correctly by the underlying MPI libraries, i.e., whether messages are delivered correctly from specified sources to specified destinations. If not, FlowChecker reports the bugs and provides diagnostic information. We have built a FlowChecker prototype on Linux and evaluated it with five real-world bug cases in three widely used MPI libraries: Open MPI, MPICH2, and MVAPICH2. Our experimental results show that FlowChecker effectively detects all five evaluated bug cases and provides useful diagnostic information. Additionally, our experiments with HPL and NPB show that FlowChecker incurs low runtime overhead (0.9-9.7% on three MPI libraries).
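
The core idea of the abstract, comparing what the program intended to send against what was actually delivered, can be illustrated with a small sketch. This is not FlowChecker's implementation, which the abstract does not describe in detail. The record types `Intention` and `Delivery`, the function `check_flows`, and the diagnostic strings are all hypothetical names chosen for illustration. The sketch also assumes that intentions and deliveries have already been collected into plain lists, whereas a real tool would have to extract them from a running MPI program.

```python
from collections import namedtuple

# Hypothetical record types (not from the paper): one describes what the
# application asked for, the other what the library actually delivered.
Intention = namedtuple("Intention", "src dst tag payload")
Delivery = namedtuple("Delivery", "src dst tag payload")


def check_flows(intentions, deliveries):
    """Return diagnostic strings; an empty list means every intention was fulfilled."""
    problems = []
    remaining = list(deliveries)  # deliveries not yet matched to an intention
    for intent in intentions:
        key = (intent.src, intent.dst, intent.tag)
        # Find the first unmatched delivery with the same source, destination, and tag.
        match = next(
            (d for d in remaining if (d.src, d.dst, d.tag) == key), None
        )
        if match is None:
            problems.append(f"lost: {intent.src}->{intent.dst} tag={intent.tag}")
            continue
        remaining.remove(match)
        if match.payload != intent.payload:
            problems.append(f"corrupted: {intent.src}->{intent.dst} tag={intent.tag}")
    # Anything left over was delivered without a corresponding intention.
    for d in remaining:
        problems.append(f"unexpected: {d.src}->{d.dst} tag={d.tag}")
    return problems


if __name__ == "__main__":
    intentions = [Intention(0, 1, 7, b"abc"), Intention(1, 2, 9, b"xyz")]
    deliveries = [Delivery(0, 1, 7, b"abc"), Delivery(1, 2, 9, b"xyZ")]
    for line in check_flows(intentions, deliveries):
        print(line)
```

In this toy model, a mismatch is classified as a lost message, a corrupted payload, or an unexpected delivery, which loosely mirrors the abstract's notion of reporting a bug together with diagnostic information.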