We present a novel and flexible approach to the problem of feature selection, called grafting. Rather than treating feature selection as separate from learning, grafting treats the selection of suitable features as an integral part of learning a predictor in a regularized learning framework. To make this regularized learning process fast enough for large-scale problems, grafting operates incrementally and iteratively, gradually building up a feature set while training a predictor model using gradient descent. At each iteration, a fast gradient-based heuristic assesses which feature is most likely to improve the existing model; that feature is then added to the model, and the model is incrementally optimized using gradient descent. The algorithm scales linearly with the number of data points and at most quadratically with the number of features. Grafting can be used with a variety of predictor model classes, both linear and non-linear, and for both classification and regression. Experiments are reported here on a variant of grafting for classification, using both linear and non-linear models and a loss function inspired by logistic regression. Results on a variety of synthetic and real-world data sets are presented. Finally, the relationship between grafting, stagewise additive modelling, and boosting is explored.
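The incremental loop described above can be illustrated with a minimal sketch for a linear model. It is not the authors' implementation, and several details are assumptions the abstract does not state: the regularizer is taken to be an L1 penalty with weight `lam`, the selection heuristic is the magnitude of the loss gradient with respect to each unused weight, the loop stops when no unused feature's gradient magnitude exceeds `lam`, and the inner optimization uses proximal gradient steps (soft-thresholding) in place of plain gradient descent so that the L1 term is handled cleanly. The names `graft`, `logistic_grad`, `lam`, `lr`, and `inner_steps` are hypothetical.

```python
import numpy as np

def logistic_grad(X, y, w):
    """Gradient of the mean logistic loss log(1 + exp(-y * Xw)); y in {-1, +1}."""
    margins = y * (X @ w)
    # 1 / (1 + exp(margins)), computed stably via logaddexp
    coeff = -y * np.exp(-np.logaddexp(0.0, margins))
    return X.T @ coeff / len(y)

def graft(X, y, lam=0.05, lr=0.5, inner_steps=200):
    """Grafting-style incremental feature selection (illustrative sketch).

    Assumptions, not taken from the abstract: L1 penalty of weight `lam`,
    stop when no unused feature has gradient magnitude above `lam`,
    proximal gradient updates on the active set.
    """
    n, d = X.shape
    w = np.zeros(d)
    is_active = np.zeros(d, dtype=bool)
    active = []  # feature indices in the order they were added
    while len(active) < d:
        # Heuristic: score each unused feature by its gradient magnitude.
        g = logistic_grad(X, y, w)
        g[is_active] = 0.0
        j = int(np.argmax(np.abs(g)))
        if abs(g[j]) <= lam:
            break  # no unused feature is expected to reduce the penalized loss
        is_active[j] = True
        active.append(j)
        # Incrementally re-optimize the weights of the active features only.
        idx = np.array(active)
        for _ in range(inner_steps):
            step = w[idx] - lr * logistic_grad(X[:, idx], y, w[idx])
            w[idx] = np.sign(step) * np.maximum(np.abs(step) - lr * lam, 0.0)
    return w, active
```

Each outer iteration costs one full gradient evaluation, linear in the number of data points and in the number of features, and at most one feature is added per iteration, which is consistent with the linear and at-most-quadratic scaling quoted in the abstract. Weights of features that are never selected stay exactly zero.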