Empirical Assessment of Machine Learning based Software Defect Prediction Techniques

  • Authors:
  • Venkata U. B. Challagulla; Farokh B. Bastani; I-Ling Yen; Raymond A. Paul

  • Affiliations:
  • Department of Computer Science, University of Texas at Dallas, TX; Department of Computer Science, University of Texas at Dallas, TX; Department of Computer Science, University of Texas at Dallas, TX; OASD/C3I/Y2K, Department of Defense

  • Venue:
  • WORDS '05 Proceedings of the 10th IEEE International Workshop on Object-Oriented Real-Time Dependable Systems
  • Year:
  • 2005

Abstract

The wide variety of real-time software systems, including telecontrol/telepresence systems, robotic systems, and mission planning systems, can entail dynamic code synthesis based on runtime mission-specific requirements and operating conditions. This necessitates dynamic dependability assessment to ensure that these systems will perform as specified and will not fail in catastrophic ways. One approach to achieving this is to dynamically assess the modules in the synthesized code using software defect prediction techniques. Statistical models, such as Stepwise Multi-linear Regression models and multivariate models, and machine learning approaches, such as Artificial Neural Networks, Instance-based Reasoning, Bayesian-Belief Networks, Decision Trees, and Rule Induction, have been investigated for predicting software quality. However, there is still no consensus about the best predictor model for software defects. In this paper, we evaluate different predictor models on four different real-time software defect data sets. The results show that a combination of 1R and Instance-based Learning along with the Consistency-based Subset Evaluation technique provides relatively better consistency in prediction accuracy compared to other models. The results also show that "size" and "complexity" metrics are not sufficient for accurately predicting real-time software defects.
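The Instance-based Learning approach named in the abstract classifies a module by comparing its metric vector to those of previously measured modules. As a rough illustration only (not the authors' experimental setup), the sketch below implements a minimal k-nearest-neighbor defect predictor in plain Python; the module metrics and labels are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two metric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=1):
    """Predict defect-proneness of `query` by majority vote of its
    k nearest neighbors. `train` is a list of (metrics, label) pairs,
    where label 1 means defect-prone and 0 means defect-free."""
    neighbors = sorted(train, key=lambda t: euclidean(t[0], query))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if 2 * votes > k else 0

# Hypothetical training data: (lines of code, cyclomatic complexity) -> label.
train = [
    ((120, 4), 0),
    ((80, 3), 0),
    ((300, 15), 1),
    ((450, 22), 1),
]

# A new module with metrics closest to a known defect-prone one.
print(knn_predict(train, (260, 14), k=1))  # → 1
```

With k=1 the query module is labeled defect-prone because its nearest neighbor, (300, 15), is. The paper's finding that size and complexity alone are insufficient suggests a real predictor would use a richer metric set, filtered by a subset-evaluation step.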