Empirical evaluation of feature selection methods in classification

Authors:
Luka Čehovin; Zoran Bosnić
Affiliations:
Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia (corresponding author; Tel.: +386 1 4768189; E-mail: luka.cehovin@fri.uni-lj.si); Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
Venue:
Intelligent Data Analysis
Year:
2010

Abstract

In this paper, we present an empirical evaluation of five feature selection methods: ReliefF, the random forest feature selector, sequential forward selection, sequential backward selection, and the Gini index. Of these, the random forest feature selector has not yet been widely compared with the other methods. We test how feature selection with each method affects (ideally, improves) the accuracy of six different classifiers. The results show that ReliefF and random forest enabled the classifiers to achieve the highest average increase in classification accuracy while reducing the number of unnecessary attributes. These conclusions can help machine learning practitioners choose which classifier and feature selection method to use to optimize classification accuracy. This may be especially important in risk-sensitive applications of machine learning (e.g. medicine, business decisions, control applications), as well as for reducing the costs of collecting, processing, and storing unnecessary data.
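As a rough illustration of the simplest of the five methods named above, the following sketch ranks discrete features by the reduction in Gini impurity obtained when the data are split on each feature. The abstract does not describe the authors' implementation, so the function names (`gini_impurity`, `gini_gain`, `rank_features`) and the toy data are hypothetical and serve only to show the general idea of a filter-style feature ranking.

```python
from collections import Counter, defaultdict


def gini_impurity(labels):
    """Gini impurity 1 - sum_c p_c^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(feature_values, labels):
    """Impurity reduction obtained by splitting on one discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    # Impurity of each group, weighted by the share of examples it holds.
    weighted = sum(len(g) / n * gini_impurity(g) for g in groups.values())
    return gini_impurity(labels) - weighted


def rank_features(rows, labels):
    """Return (feature indices sorted by decreasing gain, per-feature gains)."""
    n_features = len(rows[0])
    scores = [
        gini_gain([row[j] for row in rows], labels) for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: feature 0 determines the class, feature 1 is noise.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = [0, 0, 1, 1]
order, scores = rank_features(rows, labels)
print(order, scores)
```

In this toy data set the first feature separates the classes perfectly and is ranked first, while the second feature yields no impurity reduction. A filter like this scores each feature independently of any classifier, in contrast to wrapper methods such as sequential forward or backward selection, which evaluate feature subsets by training a classifier.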