Grafting: fast, incremental feature selection by gradient descent in function space

  • Authors:
  • Simon Perkins (Space and Remote Sensing Sciences, Los Alamos National Laboratory, Los Alamos, NM)
  • Kevin Lacker (Department of Computer Science, University of California, Berkeley, CA)
  • James Theiler (Space and Remote Sensing Sciences, Los Alamos National Laboratory, Los Alamos, NM)

  • Venue:
  • The Journal of Machine Learning Research
  • Year:
  • 2003

Abstract

We present a novel and flexible approach to feature selection, called grafting. Rather than treating feature selection as a step separate from learning, grafting makes the selection of suitable features an integral part of learning a predictor within a regularized learning framework. To keep this regularized learning process fast enough for large-scale problems, grafting operates in an incremental, iterative fashion, gradually building up a feature set while training a predictor model by gradient descent. At each iteration, a fast gradient-based heuristic is used to assess which feature is most likely to improve the existing model; that feature is then added to the model, and the model is incrementally optimized using gradient descent. The algorithm scales linearly with the number of data points and at most quadratically with the number of features. Grafting can be used with a variety of predictor model classes, both linear and non-linear, and for both classification and regression. Experiments are reported here on a variant of grafting for classification, using both linear and non-linear models with a logistic regression-inspired loss function. Results on a variety of synthetic and real-world data sets are presented. Finally, the relationship between grafting, stagewise additive modelling, and boosting is explored.
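
The abstract gives no pseudocode, so the following is only a rough sketch of the grafting loop it describes, for the simplest case of a linear model with a logistic loss and an L1 penalty. The function name grafting_logistic, the penalty lam, the learning rate, the inner gradient-descent loop, and the stopping test (a feature is added only if the magnitude of its loss gradient exceeds the penalty) are assumptions made for illustration, not details taken from the paper.

import numpy as np

def grafting_logistic(X, y, lam=0.1, max_features=20, lr=0.1, inner_steps=200):
    # Hypothetical sketch of the grafting loop for an L1-penalized linear
    # logistic model.  X is (n_samples, n_features); y holds labels in {0, 1}.
    n, d = X.shape
    selected = []                     # indices of features currently in the model
    w = np.zeros(0)                   # weights for the selected features
    b = 0.0                           # intercept

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    for _ in range(min(max_features, d)):
        # Gradient of the unpenalized loss w.r.t. every weight, with the
        # not-yet-selected weights held at zero.
        p = sigmoid(X[:, selected] @ w + b)
        resid = p - y                 # d(loss)/dz for the logistic loss
        grad_all = X.T @ resid / n
        grad_all[selected] = 0.0      # features already in the model are skipped

        j = int(np.argmax(np.abs(grad_all)))
        if np.abs(grad_all[j]) <= lam:
            break                     # no remaining feature can beat the penalty
        selected.append(j)
        w = np.append(w, 0.0)

        # Incrementally re-optimize the selected weights by gradient descent
        # (a subgradient is used for the L1 term).
        for _ in range(inner_steps):
            p = sigmoid(X[:, selected] @ w + b)
            resid = p - y
            w -= lr * (X[:, selected].T @ resid / n + lam * np.sign(w))
            b -= lr * resid.mean()

    return selected, w, b

On synthetic data in which only a few columns of X are informative, the returned selected list should concentrate on those columns; in this sketch, lam controls how aggressively candidate features are rejected.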