Has the bug really been fixed?

  • Authors: Zhongxian Gu, Earl T. Barr, David J. Hamilton, Zhendong Su
  • Affiliation: University of California at Davis (all authors)
  • Venue: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1
  • Year: 2010

Abstract

Software has bugs, and fixing those bugs pervades the software engineering process. It is folklore that bug fixes are often buggy themselves, resulting in bad fixes that either fail to fix a bug or create new bugs. To confirm this folklore, we explored the bug databases of the Ant, AspectJ, and Rhino projects and found that bad fixes comprise as much as 9% of all bugs. Thus, detecting and correcting bad fixes is important for improving the quality and reliability of software. However, no prior work has systematically considered this bad fix problem, which this paper introduces and formalizes. In particular, the paper formalizes two criteria for determining whether a fix resolves a bug: coverage and disruption. The coverage of a fix measures the extent to which the fix correctly handles all inputs that may trigger a bug, while disruption measures the deviations from the program's intended behavior after the fix is applied. This paper also introduces the novel notion of a distance-bounded weakest precondition as the basis for practical techniques that compute the coverage and disruption of a fix. To validate our approach, we implemented Fixation, a prototype that automatically detects bad fixes for Java programs. When it detects a bad fix, Fixation returns an input that still triggers the bug or reports a newly introduced bug. Programmers can then use that bug-triggering input to refine or reformulate their fix. We manually extracted bad fixes from real-world projects and evaluated Fixation against them: Fixation successfully detected these bad fixes.
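
For intuition, the two informal definitions above can be read set-theoretically. The sketch below is only an illustration under assumed notation (a bug-triggering input set B, a behavioral oracle ok, and the programs P and P' before and after a fix f); the paper's actual formalization rests on distance-bounded weakest preconditions, not on these sets.

% Illustrative sketch only; every symbol here is an assumption, not
% taken from the paper: B is the set of inputs that trigger the bug,
% P and P' are the program before and after applying fix f, and
% ok(Q, i) means program Q behaves as intended on input i.
\[
  \mathrm{coverage}(f) \;=\;
    \frac{\bigl|\{\, i \in B : \mathrm{ok}(P', i) \,\}\bigr|}{|B|}
\]
\[
  \mathrm{disruption}(f) \;=\;
    \bigl|\{\, i \notin B : \mathrm{ok}(P, i) \,\wedge\, \neg\,\mathrm{ok}(P', i) \,\}\bigr|
\]
% Coverage below 1 means some bug-triggering inputs are still
% mishandled (the bug is not fully fixed); nonzero disruption means
% the fix breaks behavior that was previously correct.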