An in-depth study of the potentially confounding effect of class size in fault prediction

  • Authors:
  • Yuming Zhou;Baowen Xu;Hareton Leung;Lin Chen

  • Affiliations:
  • Nanjing University;Nanjing University;Hong Kong Polytechnic University;Nanjing University

  • Venue:
  • ACM Transactions on Software Engineering and Methodology (TOSEM)
  • Year:
  • 2014

Quantified Score

Hi-index 0.00

Visualization

Abstract

Background. The extent of the potentially confounding effect of class size in the fault prediction context is not clear, nor is the method to remove the potentially confounding effect, or the influence of this removal on the performance of fault-proneness prediction models. Objective. We aim to provide an in-depth understanding of the effect of class size on the true associations between object-oriented metrics and fault-proneness. Method. We first employ statistical methods to examine the extent of the potentially confounding effect of class size in the fault prediction context. After that, we propose a linear regression-based method to remove the potentially confounding effect. Finally, we empirically investigate whether this removal could improve the prediction performance of fault-proneness prediction models. Results. Based on open-source software systems, we found: (a) the confounding effect of class size on the associations between object-oriented metrics and fault-proneness in general exists; (b) the proposed linear regression-based method can effectively remove the confounding effect; and (c) after removing the confounding effect, the prediction performance of fault prediction models with respect to both ranking and classification can in general be significantly improved. Conclusion. We should remove the confounding effect of class size when building fault prediction models.