In Streaming Feature Selection (SFS), new features are sequentially considered for addition to a predictive model. When the space of potential features is large, SFS offers many advantages over traditional feature selection methods, which assume that all features are known in advance: features can be generated dynamically, the search for new features can be focused on promising subspaces, and overfitting can be controlled by dynamically adjusting the threshold for adding features to the model. We describe α-investing, an adaptive complexity penalty method for SFS that dynamically adjusts the threshold on the error reduction required for adding a new feature. α-investing gives false discovery rate-style guarantees against overfitting. It differs from standard penalty methods such as AIC, BIC, or RIC, each of which drastically over- or under-fits in the limit of an infinite number of non-predictive features. Empirical results show that SFS is competitive with far more compute-intensive feature selection methods such as stepwise regression, and allows feature selection on problems with over a million potential features.
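The abstract describes α-investing only at a high level; the sketch below illustrates one common form of the idea, assuming each streamed feature is summarized by a p-value for its error reduction. The spending rule `alpha_i = wealth / (2 * i)`, the initial wealth `w0`, and the acceptance `payout` are assumed parameter choices for illustration, not necessarily the paper's exact settings.

```python
def alpha_investing(pvalues, w0=0.5, payout=0.5):
    """Sketch of α-investing over a stream of feature p-values.

    Each candidate feature i arrives with a p-value for the error
    reduction it provides. We "bid" part of the current alpha-wealth
    as the significance threshold: accepting a feature earns back a
    payout, rejecting one forfeits the bid. Because total wealth
    bounds total alpha spent, this gives FDR-style control against
    overfitting. The w/(2i) spending rule is an assumption here.
    """
    wealth = w0
    selected = []
    for i, p in enumerate(pvalues, start=1):
        alpha = wealth / (2 * i)       # bid a fraction of current wealth
        if p <= alpha:                 # feature reduces error enough: add it
            selected.append(i - 1)
            wealth += payout - alpha   # earn the payout, minus the bid
        else:
            wealth -= alpha            # feature rejected: lose the bid
    return selected

# Strong features early and late in the stream are accepted; the
# non-predictive ones in between only drain a little wealth.
print(alpha_investing([0.001, 0.9, 0.9, 0.0001]))  # → [0, 3]
```

Note how the threshold adapts: accepting a feature replenishes wealth, so the method stays willing to test later features, while a long run of rejections shrinks the threshold, which is what prevents the drastic over-fitting (AIC-style fixed penalties) or under-fitting (RIC-style harsh penalties) in the limit of infinitely many non-predictive features.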