Feature selection is usually motivated by reduced computational complexity, economy, and better problem understanding, but in many cases it can also improve classification accuracy. In this paper we investigate the relationship between the optimal number of features and the training set size. We present a new and simple analysis of the well-studied two-Gaussian setting, explicitly derive the optimal number of features as a function of the training set size for a few special cases, and show that adding too many features causes accuracy to decline dramatically. We then show empirically that the Support Vector Machine (SVM), which was designed to work in the presence of a large number of features, produces the same qualitative result on these examples. This suggests that good feature selection remains an important component of accurate classification.
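The qualitative effect the abstract describes can be reproduced in a few lines. The sketch below is an assumption-laden illustration, not the paper's own experiment: it uses the classic two-Gaussian construction in the style of Trunk (1979), where class c in {-1, +1} has mean c*mu with mu_i = 1/sqrt(i) and identity covariance, so every added feature is informative but progressively weaker. The feature count D, training size m, and the use of scikit-learn's LinearSVC are all choices made here for illustration; the paper's exact setting and derived optimum may differ.

    # Sketch of the "peaking" phenomenon: with a fixed, small training set,
    # test accuracy first rises and then falls as more (weaker) features
    # are added, even for a linear SVM.
    import numpy as np
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    D = 200   # total number of available features (assumed for this demo)
    m = 20    # training examples per class (small on purpose)
    mu = 1.0 / np.sqrt(np.arange(1, D + 1))  # Trunk-style decaying means

    def sample(n_per_class):
        """Draw n_per_class points from each of the two Gaussian classes."""
        X_pos = rng.normal(loc=mu, size=(n_per_class, D))
        X_neg = rng.normal(loc=-mu, size=(n_per_class, D))
        X = np.vstack([X_pos, X_neg])
        y = np.hstack([np.ones(n_per_class), -np.ones(n_per_class)])
        return X, y

    X_train, y_train = sample(m)
    X_test, y_test = sample(5000)  # large test set for a stable estimate

    for d in (1, 2, 5, 10, 20, 50, 100, 200):
        # Train on only the first d features and measure test accuracy.
        clf = LinearSVC(C=1.0, max_iter=10000).fit(X_train[:, :d], y_train)
        acc = clf.score(X_test[:, :d], y_test)
        print(f"d = {d:4d}  test accuracy = {acc:.3f}")

In a typical run, accuracy peaks at a moderate d and then degrades as the weak tail features add estimation noise faster than signal; increasing m shifts the peak to larger d, which is exactly the optimal-number-of-features-versus-training-set-size relationship the paper analyzes.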