Multiplicative update rules have proven useful in many areas of machine learning. Simple to implement, guaranteed to converge, they account in part for the widespread popularity of algorithms such as nonnegative matrix factorization and Expectation-Maximization. In this paper, we show how to derive multiplicative updates for problems in L1-regularized linear and logistic regression. For L1-regularized linear regression, the updates are derived by reformulating the required optimization as a problem in nonnegative quadratic programming (NQP). The dual of this problem, itself an instance of NQP, can also be solved using multiplicative updates; moreover, the observed duality gap can be used to bound the error of intermediate solutions. For L1-regularized logistic regression, we derive similar updates using an iteratively reweighted least squares approach. We present illustrative experimental results and describe efficient implementations for large-scale problems of interest (e.g., with tens of thousands of examples and over one million features).
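To make the approach concrete, here is a small NumPy sketch (not the authors' code) of the multiplicative update for nonnegative quadratic programming, minimize ½vᵀAv + bᵀv subject to v ≥ 0, applied to L1-regularized least squares via the standard split w = u − v with u, v ≥ 0. The helper names (`nqp_multiplicative`, `lasso_multiplicative`) and the iteration counts are illustrative choices, not part of the paper.

```python
import numpy as np

def nqp_multiplicative(A, b, iters=5000, eps=1e-12):
    """Multiplicative updates for min_{v >= 0} 0.5 v'Av + b'v.

    Each component is rescaled by a nonnegative factor built from the
    positive part A+ and negative part A- of the matrix A, so iterates
    stay in the nonnegative orthant and the objective never increases.
    """
    Ap = np.maximum(A, 0.0)    # A+ : elementwise positive part of A
    Am = np.maximum(-A, 0.0)   # A- : elementwise magnitude of negative part
    v = np.ones(len(b))        # any strictly positive start works
    for _ in range(iters):
        ap = Ap @ v + eps      # eps guards the division below
        am = Am @ v
        v = v * (-b + np.sqrt(b * b + 4.0 * ap * am)) / (2.0 * ap)
    return v

def lasso_multiplicative(X, y, lam, iters=5000):
    """L1-regularized least squares, min_w 0.5||Xw - y||^2 + lam*||w||_1,
    recast as NQP over z = [u; v] with w = u - v and u, v >= 0."""
    G = X.T @ X
    c = X.T @ y
    n = X.shape[1]
    A = np.block([[G, -G], [-G, G]])
    b = np.concatenate([lam - c, lam + c])
    z = nqp_multiplicative(A, b, iters)
    return z[:n] - z[n:]
```

On a problem with orthonormal design (X = I), the lasso solution is the soft threshold of y, which gives a quick sanity check: with y = (3, −0.5) and lam = 1, the updates drive w toward (2, 0). The update factor equals 1 exactly when the KKT stationarity condition (Av + b)ᵢ = 0 holds, which is why fixed points coincide with constrained minima.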