Many statistical machine learning algorithms minimize either an empirical loss function, as in AdaBoost, or a penalized empirical loss, as in Lasso or SVM. A single regularization tuning parameter controls the trade-off between fidelity to the data and generalizability, or equivalently between bias and variance. As this tuning parameter varies, a regularization "path" of solutions to the minimization problem is generated, and the whole path is needed to select a tuning parameter that optimizes prediction or interpretation performance. Algorithms such as homotopy-Lasso (or LARS-Lasso) and Forward Stagewise Fitting (FSF, aka ε-Boosting) are of great interest because, in addition to prediction, they yield sparse models that aid interpretation. In this paper, we propose the BLasso algorithm, which ties the FSF (ε-Boosting) algorithm to the Lasso method for minimizing the L1-penalized L2 loss. BLasso is derived as a coordinate descent method with a fixed stepsize applied to the general Lasso loss function (an L1-penalized convex loss). It consists of both a forward step and a backward step. The forward step is similar to ε-Boosting or FSF, but the backward step is new and revises the FSF (ε-Boosting) path to approximate the Lasso path. When the number of base learners is finite and the Hessian of the loss function is bounded, the BLasso path is shown to converge to the Lasso path as the stepsize goes to zero. When the number of base learners exceeds the sample size and the true model is sparse, our simulations indicate that BLasso model estimates are sparser than those from FSF, with comparable or slightly better prediction performance, and that the discrete stepsize of BLasso and FSF has an additional regularization effect in terms of prediction and sparsity. Moreover, we introduce the Generalized BLasso algorithm to minimize a general convex loss penalized by a general convex function. Since (Generalized) BLasso relies only on differences, not derivatives, we conclude that it provides a class of simple and easy-to-implement algorithms for tracing the regularization or solution paths of penalized minimization problems.
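To make the forward/backward mechanics concrete, the following is a minimal sketch of BLasso specialized to the L1-penalized L2 loss, Γ(β; λ) = L(β) + λ‖β‖₁ with L(β) = ½‖y − Xβ‖², using the p coordinates of β as the base learners. This is an illustration under stated assumptions, not the authors' reference implementation: the function name blasso and the parameters eps (stepsize ε), xi (backward-step tolerance ξ), and max_iter are hypothetical, and the brute-force candidate search is kept deliberately simple.

    import numpy as np

    def blasso(X, y, eps=0.01, xi=1e-6, max_iter=5000):
        # Sketch of BLasso for the L1-penalized L2 loss (assumed setup,
        # coordinates of beta as base learners, fixed stepsize eps).
        n, p = X.shape

        def loss(b):
            return 0.5 * np.sum((y - X @ b) ** 2)

        def best_forward(b):
            # Forward step of FSF / epsilon-Boosting: the (coordinate, sign)
            # move of size eps that most reduces the empirical loss.
            best_l, best_b = np.inf, None
            for j in range(p):
                for s in (eps, -eps):
                    cand = b.copy()
                    cand[j] += s
                    l = loss(cand)
                    if l < best_l:
                        best_l, best_b = l, cand
            return best_l, best_b

        # Initialization: one forward step from zero fixes the starting lambda.
        beta = np.zeros(p)
        l0 = loss(beta)
        l1, beta = best_forward(beta)
        lam = (l0 - l1) / eps
        path = [(lam, beta.copy())]

        for _ in range(max_iter):
            cur = loss(beta)
            # Backward step candidate: shrink one active coordinate by eps.
            back_l, back_b = np.inf, None
            for j in np.nonzero(beta)[0]:
                cand = beta.copy()
                cand[j] -= np.sign(beta[j]) * eps
                l = loss(cand)
                if l < back_l:
                    back_l, back_b = l, cand
            if back_b is not None and back_l - cur < lam * eps - xi:
                # Backward step: the penalized (Lasso) loss strictly decreases,
                # so revise the path toward the Lasso solution.
                beta = back_b
            else:
                # Otherwise take a forward step and relax lambda if necessary.
                fwd_l, beta = best_forward(beta)
                lam = min(lam, (cur - fwd_l) / eps)
            path.append((lam, beta.copy()))
            if lam <= 0:
                break
        return path  # approximate Lasso path as (lambda, beta) pairs

Note that each candidate evaluation in this sketch recomputes the full quadratic loss in O(np) time; in practice the loss differences can be updated incrementally, but the brute-force version keeps the forward/backward logic, and the fact that only loss differences (not derivatives) are needed, plainly visible.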