Enhancing software quality estimation using ensemble-classifier based noise filtering

  • Authors:
  • Taghi M. Khoshgoftaar;Shi Zhong;Vedang Joshi

  • Affiliations:
  • Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431, USA. E-mail: {taghi, zhong, vjoshi}@cse.fau.edu

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2005

Abstract

This paper presents a technique that improves the accuracy of classification models by enhancing the quality of the training data. The idea is to eliminate instances that are likely to be noisy and to train classification models on the resulting "clean" data. Our approach uses 25 different classification techniques to build an ensemble classifier that filters noise. Using a relatively large number of base-level classifiers in the ensemble filter makes several levels of filtering possible, so the desired degree of conservativeness in noise removal can be chosen. It also provides a high degree of confidence in the noise elimination procedure, because with 25 base-level classifiers the results are less likely to be influenced by the (possibly) inappropriate learning bias of a few algorithms than they would be with a smaller ensemble. An empirical case study with software measurement data from a high-assurance software project demonstrates the effectiveness of our noise elimination approach in improving classification accuracy. The similarities among the predictions of the 25 classifiers are also investigated, and preliminary results suggest that the 25 classifiers may be effectively reduced to 13.
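
The filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the 25 base-level classifiers or the voting procedure, so this sketch assumes three toy learners (1-NN, 3-NN, nearest centroid), assumes that each instance is judged by classifiers trained on folds that exclude it, and uses a hypothetical `min_votes` parameter as the conservativeness level. All function and variable names are illustrative.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def knn_learner(k):
    """Return a trainer for a k-nearest-neighbour classifier (toy stand-in)."""
    def train(X, y):
        def predict(x):
            nearest = sorted(range(len(X)), key=lambda i: sq_dist(X[i], x))[:k]
            return Counter(y[i] for i in nearest).most_common(1)[0][0]
        return predict
    return train

def centroid_learner(X, y):
    """Train a nearest-centroid classifier (toy stand-in)."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    def predict(x):
        return min(centroids, key=lambda lab: sq_dist(centroids[lab], x))
    return predict

def misclassification_counts(X, y, learners, n_folds=3):
    """For each instance, count how many base learners misclassify it
    when trained on the folds that do not contain it (assumed procedure)."""
    counts = [0] * len(X)
    for fold in range(n_folds):
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        for train in learners:
            predict = train(X_tr, y_tr)
            for i in test_idx:
                if predict(X[i]) != y[i]:
                    counts[i] += 1
    return counts

def kept_indices(counts, min_votes):
    """Keep instances misclassified by fewer than min_votes learners.
    A higher min_votes gives a more conservative filter."""
    return [i for i, c in enumerate(counts) if c < min_votes]

# Toy data: two well-separated 3x3 grids, with one label deliberately flipped.
X = [[i % 3, i // 3] for i in range(9)] + [[10 + i % 3, 10 + i // 3] for i in range(9)]
y = [0] * 9 + [1] * 9
y[4] = 1  # injected label noise at the centre of the class-0 cluster

LEARNERS = [knn_learner(1), knn_learner(3), centroid_learner]
counts = misclassification_counts(X, y, LEARNERS)
consensus_keep = kept_indices(counts, min_votes=len(LEARNERS))  # most conservative
aggressive_keep = kept_indices(counts, min_votes=1)             # least conservative

print("misclassification counts:", counts)
print("removed (consensus):", sorted(set(range(len(X))) - set(consensus_keep)))
print("removed (aggressive):", sorted(set(range(len(X))) - set(aggressive_keep)))
```

In this sketch, raising `min_votes` toward the ensemble size removes only instances that every learner disagrees with, while lowering it removes more instances at the risk of discarding clean ones. With 25 base-level classifiers, as in the paper, many more intermediate thresholds would be available than with the three learners assumed here.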