Empirical evaluation of feature selection methods in classification

  • Authors:
  • Luka Čehovin; Zoran Bosnić

  • Affiliations:
  • Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia (corresponding author; tel.: +386 1 4768189; e-mail: luka.cehovin@fri.uni-lj.si); Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2010

Abstract

In this paper, we present an empirical evaluation of five feature selection methods: ReliefF, random forest feature selector, sequential forward selection, sequential backward selection, and Gini index. Of these, the random forest feature selector has not yet been widely compared with the other methods. We test how each feature selection method affects (ideally, improves) the accuracy of six different classifiers. The results show that, on average, ReliefF and the random forest feature selector enabled the classifiers to achieve the highest increase in classification accuracy while reducing the number of unnecessary attributes. These conclusions can advise machine learning users on which classifier and feature selection method to use to optimize classification accuracy. This may be especially important in risk-sensitive applications of machine learning (e.g. medicine, business decisions, control applications) and where the aim is to reduce the cost of collecting, processing, and storing unnecessary data.
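
As an illustration of the simplest of the five methods named above, the following is a minimal sketch of Gini-index feature ranking for discrete attributes. It is not the authors' implementation: the function names (`gini_impurity`, `gini_gain`, `rank_features`) and the toy data are hypothetical, and the abstract does not state how the Gini index was computed or thresholded in the study.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum over classes of p_c^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(feature_values, labels):
    """Decrease in Gini impurity obtained by splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Impurity after the split, weighted by the size of each group.
    weighted = sum(len(g) / n * gini_impurity(g) for g in groups.values())
    return gini_impurity(labels) - weighted


def rank_features(rows, labels):
    """Return feature indices sorted from highest to lowest Gini gain."""
    n_features = len(rows[0])
    scores = [
        gini_gain([row[j] for row in rows], labels) for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Hypothetical toy data: feature 0 determines the class, feature 1 is noise.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = [0, 0, 1, 1]
ranking = rank_features(rows, labels)
```

A feature selector built on this score would keep the top-ranked attributes and train the classifier on that subset only. How many attributes to keep is a separate design choice that this sketch leaves open.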