Finding complex concurrency bugs in large multi-threaded applications

  • Authors:
  • Pedro Fonseca;Cheng Li;Rodrigo Rodrigues

  • Affiliations:
  • Max Planck Institute for Software Systems (MPI-SWS), Kaiserslautern and Saarbrücken, Germany;Max Planck Institute for Software Systems (MPI-SWS), Kaiserslautern and Saarbrücken, Germany;Max Planck Institute for Software Systems (MPI-SWS), Kaiserslautern and Saarbrücken, Germany

  • Venue:
  • Proceedings of the sixth conference on Computer systems
  • Year:
  • 2011

Abstract

Parallel software is increasingly necessary to take advantage of multi-core architectures, but it is also prone to concurrency bugs, which are particularly hard to avoid, find, and fix because their occurrence depends on specific thread interleavings. In this paper we propose a concurrency bug detector that automatically identifies when an execution of a program triggers a concurrency bug. Unlike previous concurrency bug detectors, ours can find two particularly hard classes of bugs. The first are bugs that manifest themselves as subtle violations of application semantics, such as returning an incorrect result. The second are latent bugs, which silently corrupt internal data structures and are especially hard to detect because they do not become visible immediately after they are triggered. Our detector, Pike, finds these concurrency bugs by checking both the output and the internal state of the application for linearizability at the level of user requests. This paper presents this technique for finding concurrency bugs, its application in a testing tool that systematically searches for such problems, and our experience in applying the approach to MySQL, a large, complex multi-threaded application. We found several concurrency bugs in a stable version of the application, including subtle violations of application semantics, latent bugs, and incorrect error replies.
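
To make the idea of request-level linearizability checking concrete, the following is a minimal sketch, not the paper's implementation. It assumes that all requests overlap in time (so any serial order is admissible), that the application state can be copied and compared for equality, and that each request is a deterministic function of the state. The names `run_serial`, `is_linearizable`, and `increment` are hypothetical and chosen only for illustration. The check accepts a concurrent execution only if some serial order of the same requests reproduces both the observed replies and the observed final state.

```python
from itertools import permutations
import copy


def run_serial(initial_state, requests, order):
    """Apply the requests one at a time in the given order on a fresh copy."""
    state = copy.deepcopy(initial_state)
    outputs = {}
    for i in order:
        outputs[i] = requests[i](state)  # each request mutates state, returns a reply
    return outputs, state


def is_linearizable(initial_state, requests, observed_outputs, observed_state):
    """Brute-force check: does any serial order explain replies AND final state?

    Assumption: every request overlaps with every other one, so no real-time
    ordering constraint has to be enforced between them.
    """
    for order in permutations(range(len(requests))):
        outputs, state = run_serial(initial_state, requests, order)
        if outputs == observed_outputs and state == observed_state:
            return True
    return False


def increment(state):
    """Hypothetical request: bump a counter and reply with the new value."""
    state["counter"] += 1
    return state["counter"]


requests = [increment, increment]
initial = {"counter": 0}

# Semantic violation: both requests replied 1 (a lost update).
lost_update = is_linearizable(initial, requests, {0: 1, 1: 1}, {"counter": 1})

# Valid execution: equivalent to running request 1 first, then request 0.
valid = is_linearizable(initial, requests, {0: 2, 1: 1}, {"counter": 2})

# Latent bug: replies look plausible, but the internal state is wrong.
latent = is_linearizable(initial, requests, {0: 1, 1: 2}, {"counter": 1})
```

Comparing the final state in addition to the replies is what lets this kind of check flag the latent case: the replies alone match a serial order, but no serial order leaves the counter at 1. The brute-force enumeration is factorial in the number of requests, so it is only practical for the small request sets typical of a targeted test.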