Do too many cooks spoil the broth? Using the number of developers to enhance defect prediction models

  • Authors:
  • Elaine J. Weyuker;Thomas J. Ostrand;Robert M. Bell

  • Affiliations:
  • AT&T Labs - Research, Florham Park, USA 07932 (all three authors)

  • Venue:
  • Empirical Software Engineering
  • Year:
  • 2008

Abstract

Fault prediction by negative binomial regression models is shown to be effective for four large production software systems from industry. A model developed originally with data from systems with regularly scheduled releases was successfully adapted to a system without releases to identify 20% of that system's files that contained 75% of the faults. A model with a pre-specified set of variables derived from earlier research was applied to three additional systems, and proved capable of identifying averages of 81, 94 and 76% of the faults in those systems. A primary focus of this paper is to investigate the impact on predictive accuracy of using data about the number of developers who access individual code units. For each system, including the cumulative number of developers who had previously modified a file yielded no more than a modest improvement in predictive accuracy. We conclude that while many factors can "spoil the broth" (lead to the release of software with too many defects), the number of developers is not a major influence.
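The headline figures in the abstract (for example, 20% of files containing 75% of the faults) can be read as: rank files by predicted fault count, take the top fraction, and measure the share of actual faults that falls in that set. The following is a minimal sketch of that calculation only, not the authors' implementation. The function name `faults_in_top_fraction` and the ten-file data set are hypothetical, and the predicted counts are made-up numbers standing in for the output of a fitted model such as a negative binomial regression.

```python
def faults_in_top_fraction(predicted, actual, fraction=0.2):
    """Share of actual faults found in the top `fraction` of files,
    where files are ranked by predicted fault count (highest first)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    n_files = len(predicted)
    total_faults = sum(actual)
    if n_files == 0 or total_faults == 0:
        return 0.0
    # Number of files selected for inspection; always at least one file.
    n_selected = max(1, round(n_files * fraction))
    # File indices ordered by descending predicted fault count.
    ranked = sorted(range(n_files), key=lambda i: predicted[i], reverse=True)
    captured = sum(actual[i] for i in ranked[:n_selected])
    return captured / total_faults


# Hypothetical data for ten files (illustrative values, not from the paper).
predicted = [5.1, 0.2, 3.3, 0.1, 0.4, 0.0, 1.2, 0.3, 0.1, 0.2]
actual = [6, 0, 3, 0, 1, 0, 2, 0, 0, 0]

share = faults_in_top_fraction(predicted, actual, fraction=0.2)
print(f"Top 20% of files contain {share:.0%} of the faults")
```

With this toy data the two highest-ranked files hold 9 of the 12 faults. Comparing this share for a model with and without a developer-count variable is one way to express the "modest improvement" the abstract describes.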