Towards logistic regression models for predicting fault-prone code across software projects

  • Authors:
  • Ana Erika Camargo Cruz;Koichiro Ochimizu

  • Affiliations:
  • Japan Advanced Institute of Science and Technology, School of Information Science, 1-1 Asahidai, Nomi, Ishikawa;Japan Advanced Institute of Science and Technology, School of Information Science, 1-1 Asahidai, Nomi, Ishikawa

  • Venue:
  • ESEM '09 Proceedings of the 2009 3rd International Symposium on Empirical Software Engineering and Measurement
  • Year:
  • 2009

Abstract

In this paper, we discuss the challenge of making logistic regression models able to predict fault-prone object-oriented classes across software projects. Several studies have obtained successful results using design-complexity metrics for this purpose. However, our data exploration indicates that the distribution of these metrics varies from project to project, which makes prediction across projects difficult. As a first attempt to solve this problem, we employed simple log transformations to make design-complexity measures more comparable among projects. We found these transformations useful for projects whose data is less spread out than the data used to build the prediction model.
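
The idea of log-transforming a metric before feeding it to a logistic model can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not give the exact transformation, so ln(1 + x) is assumed here to handle zero-valued metrics, and the metric values, the helper names, and the use of the value range as a measure of spread are all hypothetical.

```python
import math


def log_transform(values):
    """Apply ln(1 + x) to each non-negative metric value (assumed form)."""
    return [math.log1p(v) for v in values]


def predict_fault_prone(metric_value, intercept, slope):
    """Univariate logistic model: P = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * metric_value)))


def value_range(values):
    """Crude measure of spread: max minus min."""
    return max(values) - min(values)


# Hypothetical per-class values of one design-complexity metric
# (for example, a coupling count) in two different projects.
project_a = [1, 2, 3, 5, 8, 40, 120]   # widely spread
project_b = [0, 1, 1, 2, 3, 6, 15]     # less spread

# How much wider project A's range is than project B's,
# before and after the transformation.
raw_gap = value_range(project_a) / value_range(project_b)
log_gap = value_range(log_transform(project_a)) / value_range(log_transform(project_b))

print(f"range ratio, raw metric: {raw_gap:.2f}")
print(f"range ratio, log metric: {log_gap:.2f}")

# Scoring a class from project B with a model whose coefficients are
# purely illustrative (they are not taken from the paper).
p = predict_fault_prone(math.log1p(6), intercept=-3.0, slope=1.2)
print(f"illustrative fault-proneness probability: {p:.2f}")
```

In this toy data the two projects' ranges are much closer after the transformation, which is the sense in which a log scale can make measures more comparable; whether that helps a given model still depends on the actual project data.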