We consider the problem of online linear regression on individual sequences. The goal is for the forecaster to output sequential predictions which are, after T time rounds, almost as good as those of the best linear predictor in a given ℓ1-ball in R^d. We consider both the case where the dimension d is small relative to the time horizon T and the case where it is large. We first present regret bounds with optimal dependencies on d, T, and on the sizes U, X, and Y of the ℓ1-ball, the input data, and the observations. The minimax regret is shown to exhibit a regime transition around the point d = √T UX/(2Y). Furthermore, we present efficient algorithms that are adaptive, i.e., that do not require knowledge of U, X, Y, and T, yet still achieve nearly optimal regret bounds.
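To make the setting concrete, below is a minimal sketch of one standard baseline for this problem: projected online gradient descent on the square loss, with Euclidean projection onto the ℓ1-ball of radius U. This is an illustration of the online protocol only, not the paper's adaptive algorithms; the learning rate `eta` and the data are hypothetical choices, and the projection follows the well-known sort-and-threshold scheme.

```python
import numpy as np

def project_l1(v, U):
    """Euclidean projection of v onto the l1-ball {w : ||w||_1 <= U},
    via the standard sort-and-threshold scheme."""
    if np.abs(v).sum() <= U:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]          # sorted magnitudes, descending
    css = np.cumsum(u)
    # largest index rho with u[rho] * (rho+1) > css[rho] - U
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > (css - U))[0][-1]
    theta = (css[rho] - U) / (rho + 1.0)  # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def ogd_l1(xs, ys, U, eta):
    """Projected online gradient descent on the square loss over the l1-ball.
    At each round: predict w.x, observe y, take a gradient step, project."""
    w = np.zeros(xs.shape[1])
    preds = []
    for x, y in zip(xs, ys):
        preds.append(float(w @ x))
        grad = 2.0 * (preds[-1] - y) * x   # gradient of (w.x - y)^2 in w
        w = project_l1(w - eta * grad, U)
    return np.array(preds), w
```

The projection keeps every iterate inside the comparison class, so the forecaster's cumulative square loss can be compared round by round against the best fixed w with ||w||_1 ≤ U, which is exactly the regret notion used above.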