Measuring stability of feature ranking techniques: a noise-based approach

  • Authors:
  • Wilker Altidor; Taghi M. Khoshgoftaar; Amri Napolitano

  • Affiliations:
  • Department of Computer & Electrical Engineering & Computer Science, Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431, USA (all authors)

  • Venue:
  • International Journal of Business Intelligence and Data Mining
  • Year:
  • 2012

Abstract

A common criterion for evaluating feature selection methods is the performance of a chosen classifier trained with the selected features. Another important evaluation criterion, which until recently has been neglected, is the stability of these feature selection methods. While prior studies have measured the degree of agreement between the outputs of a technique applied to randomly selected subsets of the same input data, this study demonstrates the importance of evaluating stability in the presence of noise. Experiments are conducted with 17 filters (six standard filter-based ranking techniques and 11 threshold-based feature selection techniques) on nine real-world datasets. This paper identifies the techniques that are inherently more sensitive to class noise and demonstrates how certain characteristics of the data (sample size and class imbalance) can affect the stability of some feature selection methods.
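
As a concrete illustration of the noise-based stability protocol the abstract describes, the sketch below ranks features on clean data and again on copies of the data with class labels flipped at increasing rates, then compares the two rankings. The specific filter (chi-squared scoring) and agreement measure (Kendall's tau) are illustrative assumptions, not necessarily among the paper's 17 filters or its actual stability metric.

```python
# Minimal sketch of noise-based stability evaluation, assuming a
# chi-squared filter and Kendall's tau as the agreement measure.
# These stand in for the paper's filters and metric; the point is the
# protocol: rank on clean labels, rank again under injected class
# noise, and quantify how much the ranking moves.
import numpy as np
from scipy.stats import kendalltau, rankdata
from sklearn.datasets import make_classification
from sklearn.feature_selection import chi2
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

def feature_ranks(X, y):
    """Rank features by chi-squared relevance (rank 1 = most relevant)."""
    scores, _ = chi2(MinMaxScaler().fit_transform(X), y)  # chi2 needs non-negative inputs
    return rankdata(-scores)

def inject_class_noise(y, rate, rng):
    """Flip the binary class label of a random `rate` fraction of instances."""
    y_noisy = y.copy()
    flipped = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    y_noisy[flipped] = 1 - y_noisy[flipped]
    return y_noisy

# Synthetic binary dataset standing in for the paper's real-world data.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)
clean_ranks = feature_ranks(X, y)

for rate in (0.05, 0.10, 0.20):
    noisy_ranks = feature_ranks(X, inject_class_noise(y, rate, rng))
    tau, _ = kendalltau(clean_ranks, noisy_ranks)  # 1.0 = perfectly stable
    print(f"class-noise rate {rate:.0%}: Kendall tau = {tau:.3f}")
```

A filter whose tau degrades sharply as the noise rate grows would be flagged as noise-sensitive, mirroring the comparison the paper carries out across its 17 techniques.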