Improving Tree-Based Models of Software Quality with Principal Components Analysis

  • Authors:
  • Taghi M. Khoshgoftaar;Ruqun Shan;Edward B. Allen

  • Venue:
  • ISSRE '00 Proceedings of the 11th International Symposium on Software Reliability Engineering
  • Year:
  • 2000

Abstract

Software-quality classification models can predict which modules will be considered fault-prone, or not, based on software product metrics, process metrics, and execution metrics. Such predictions can be used to target improvement efforts at the modules that need them most. Classification-tree modeling is a robust technique for building such software quality models. However, model structure may be unstable and accuracy may suffer when predictors are highly correlated. This paper presents an empirical case study of four releases of a very large telecommunications system, which showed that the tree-based models could be improved by transforming the predictors with principal components analysis, so that the transformed predictors are not correlated. The case study used the regression-tree algorithm in the S-Plus package and then applied our general decision rule to classify modules.
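As a rough illustration of why the transformation helps, the sketch below shows principal components analysis removing the correlation between two predictors. It is not the authors' procedure (the case study used S-Plus and many more metrics). The metric names `loc` and `branches` and their values are made up for the example, and the closed-form rotation used here applies only to the two-variable case.

```python
import math

def standardize(xs):
    """Center to mean 0 and scale to unit sample standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return [(x - mean) / sd for x in xs]

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation coefficient."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

def principal_components_2d(xs, ys):
    """Rotate two standardized predictors onto their principal axes.

    For two variables, the principal axes are found by rotating through
    theta, where tan(2 * theta) = 2 * s_xy / (s_xx - s_yy).
    """
    zx, zy = standardize(xs), standardize(ys)
    sxx, syy, sxy = covariance(zx, zx), covariance(zy, zy), covariance(zx, zy)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    pc1 = [c * a + s * b for a, b in zip(zx, zy)]
    pc2 = [-s * a + c * b for a, b in zip(zx, zy)]
    return pc1, pc2

# Hypothetical per-module metrics (illustrative values, not from the paper).
loc = [120, 340, 560, 80, 910, 450, 230, 700]       # lines of code
branches = [14, 41, 60, 9, 105, 52, 30, 77]         # number of branches

pc1, pc2 = principal_components_2d(loc, branches)
print("correlation of raw predictors:", correlation(loc, branches))
print("correlation of components:    ", correlation(pc1, pc2))
```

The raw predictors are strongly correlated, while the two components are uncorrelated up to floating-point error. A tree built on the components no longer has to choose arbitrarily between near-duplicate predictors at each split, which is one plausible reading of the instability the abstract describes.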