Feature selection is usually motivated by reduced computational complexity, economy, and better problem understanding, but in many cases it can also improve classification accuracy. In this paper we investigate the relationship between the optimal number of features and the training set size. We present a new and simple analysis of the well-studied two-Gaussian setting, explicitly derive the optimal number of features as a function of the training set size for a few special cases, and show that adding too many features causes accuracy to decline dramatically. We then show empirically that the Support Vector Machine (SVM), which was designed to work in the presence of a large number of features, produces the same qualitative result on these examples. This suggests that good feature selection remains an important component of accurate classification.
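The qualitative effect the abstract describes can be reproduced in a few lines. The sketch below is an assumption-laden illustration, not the paper's own experiment: it uses the classic two-Gaussian construction in the style of Trunk (1979), where class c in {-1, +1} has mean c*mu with mu_i = 1/sqrt(i) and identity covariance, so every added feature is informative but progressively weaker. The feature count D, training size m, and the use of scikit-learn's LinearSVC are all choices made here for illustration; the paper's exact setting and derived optimum may differ.

    # Sketch of the "peaking" phenomenon: with a fixed, small training set,
    # test accuracy first rises and then falls as more (weaker) features
    # are added, even for a linear SVM.
    import numpy as np
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    D = 200   # total number of available features (assumed for this demo)
    m = 20    # training examples per class (small on purpose)
    mu = 1.0 / np.sqrt(np.arange(1, D + 1))  # Trunk-style decaying means

    def sample(n_per_class):
        """Draw n_per_class points from each of the two Gaussian classes."""
        X_pos = rng.normal(loc=mu, size=(n_per_class, D))
        X_neg = rng.normal(loc=-mu, size=(n_per_class, D))
        X = np.vstack([X_pos, X_neg])
        y = np.hstack([np.ones(n_per_class), -np.ones(n_per_class)])
        return X, y

    X_train, y_train = sample(m)
    X_test, y_test = sample(5000)  # large test set for a stable estimate

    for d in (1, 2, 5, 10, 20, 50, 100, 200):
        # Train on only the first d features and measure test accuracy.
        clf = LinearSVC(C=1.0, max_iter=10000).fit(X_train[:, :d], y_train)
        acc = clf.score(X_test[:, :d], y_test)
        print(f"d = {d:4d}  test accuracy = {acc:.3f}")

In a typical run, accuracy peaks at a moderate d and then degrades as the weak tail features add estimation noise faster than signal; increasing m shifts the peak to larger d, which is exactly the optimal-number-of-features-versus-training-set-size relationship the paper analyzes.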