A Comparative Study on the Use of Correlation Coefficients for Redundant Feature Elimination

  • Authors:
  • Pablo A. Jaskowiak;Ricardo J. G. B. Campello;Thiago F. Covoes;Eduardo R. Hruschka

  • Venue:
  • SBRN '10 Proceedings of the 2010 Eleventh Brazilian Symposium on Neural Networks
  • Year:
  • 2010

Abstract

Simplified Silhouette Filter (SSF) is a recently introduced feature selection method that automatically estimates the number of features to be selected. To do so, it combines a sampling strategy with a clustering algorithm that seeks clusters of correlated (potentially redundant) features. It is well known that the choice of similarity measure may have a great impact on clustering results. Consequently, in this application scenario, the choice may also have a great impact on the feature subset that is selected. In this paper we study six correlation coefficients as similarity measures in the clustering stage of SSF, thus giving rise to several variants of the original method. The results show that, in particular scenarios, some correlation measures select fewer features than others while still providing accurate classifiers.
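To illustrate why the choice of correlation coefficient can change which features are treated as redundant, here is a minimal Python sketch. It is not the SSF algorithm: the greedy grouping in `redundant_groups`, the 0.95 threshold, and the toy features `f1` to `f4` are all assumptions made for illustration, and only two coefficients (Pearson and Spearman) are shown, without implying that these are among the six studied in the paper.

```python
import math

def pearson(x, y):
    """Pearson (linear) correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(v):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def redundant_groups(features, corr, threshold=0.95):
    """Hypothetical greedy grouping (an assumption, not SSF's clustering):
    a feature joins the first group whose first member it correlates with
    at or above the threshold in absolute value."""
    groups = []
    for name, values in features.items():
        for group in groups:
            if abs(corr(values, features[group[0]])) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

# Toy data (hypothetical): f2 is linear in f1, f3 is monotone but
# nonlinear in f1, and f4 is a shuffled copy of f1's values.
features = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [2, 4, 6, 8, 10, 12],
    "f3": [1, 8, 27, 64, 125, 216],
    "f4": [4, 1, 6, 2, 5, 3],
}

print(redundant_groups(features, pearson))
print(redundant_groups(features, spearman))
```

On this toy data the Pearson correlation between `f1` and `f3` is about 0.94, below the assumed threshold, while their Spearman correlation is 1. Pearson therefore leaves `f3` in its own group, whereas Spearman places it with `f1` and `f2`, so keeping one feature per group would retain three features under Pearson and two under Spearman.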