We examine the mechanism by which feature selection improves the accuracy of supervised learning. An empirical bias/variance analysis as feature selection progresses indicates that the most accurate feature set corresponds to the best bias/variance trade-off point for the learning algorithm. Often this is not the point separating relevant from irrelevant features, but rather the point where the increase in variance outweighs the gain from adding more (weakly) relevant features. In other words, feature selection can be viewed as a variance reduction method that trades the benefit of decreased variance (from the reduction in dimensionality) against the harm of increased bias (from eliminating some of the relevant features). When a variance reduction method such as bagging is used, more (weakly) relevant features can be exploited, and the most accurate feature set is usually larger. In many cases, the best performance is obtained by using all available features.
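To make the empirical analysis concrete, the following is a minimal sketch of a bias/variance decomposition as features are added, not the paper's actual experimental setup. It assumes synthetic Friedman #1 regression data (where only the first five features are relevant), unpruned decision trees, 30 bootstrap replicates, and feature index order as a stand-in for a ranked feature list; all of these choices are illustrative.

```python
# Sketch: estimate squared-loss bias^2 and variance of a learner as the
# feature set grows, for a single tree and for a bagged ensemble.
# Data, learner, and replicate counts are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor

rng = np.random.default_rng(0)
# make_friedman1: only the first 5 features are relevant; the rest are noise.
X, y = make_friedman1(n_samples=600, n_features=20, noise=1.0, random_state=0)
X_tr, y_tr, X_te, y_te = X[:400], y[:400], X[400:], y[400:]

def bias_variance(model_factory, n_feats, n_boot=30):
    """Bootstrap estimate of bias^2 and variance on the test set,
    training on the first `n_feats` features only."""
    preds = np.empty((n_boot, len(y_te)))
    for b in range(n_boot):
        idx = rng.integers(0, len(y_tr), len(y_tr))  # bootstrap replicate
        model = model_factory()
        model.fit(X_tr[idx][:, :n_feats], y_tr[idx])
        preds[b] = model.predict(X_te[:, :n_feats])
    mean_pred = preds.mean(axis=0)
    # Measured against noisy targets, so this bias^2 also absorbs the
    # irreducible noise term; it still tracks how bias changes with n_feats.
    bias2 = np.mean((mean_pred - y_te) ** 2)
    var = np.mean(preds.var(axis=0))
    return bias2, var

tree = lambda: DecisionTreeRegressor(random_state=0)
bagged = lambda: BaggingRegressor(DecisionTreeRegressor(),
                                  n_estimators=25, random_state=0)

print(f"{'k':>3} {'tree bias^2':>12} {'tree var':>9} {'bag bias^2':>11} {'bag var':>8}")
for k in (2, 5, 10, 20):  # growing feature sets
    tb, tv = bias_variance(tree, k)
    bb, bv = bias_variance(bagged, k)
    print(f"{k:>3} {tb:12.3f} {tv:9.3f} {bb:11.3f} {bv:8.3f}")
```

On a task like this one would typically expect the single tree's variance to climb as irrelevant features are added while its bias falls only up to k = 5, whereas bagging keeps variance low enough that larger feature sets remain competitive, mirroring the trade-off described above.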