We consider the problem of online linear regression on individual sequences. The goal is for the forecaster to output sequential predictions which are, after T time rounds, almost as good as those of the best linear predictor in a given ℓ1-ball in R^d. We consider both the case where the dimension d is small relative to the time horizon T and the case where it is large. We first present regret bounds with optimal dependencies on d, T, and on the sizes U, X, and Y of the ℓ1-ball, the input data, and the observations. The minimax regret is shown to exhibit a regime transition around the point d = √T UX/(2Y). Furthermore, we present efficient algorithms that are adaptive, i.e., that do not require knowledge of U, X, Y, and T, yet still achieve nearly optimal regret bounds.
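To make the setting concrete, below is a minimal sketch of one standard baseline for this problem: projected online gradient descent on the square loss, with Euclidean projection onto the ℓ1-ball of radius U. This is an illustration of the online protocol only, not the paper's adaptive algorithms; the learning rate `eta` and the data are hypothetical choices, and the projection follows the well-known sort-and-threshold scheme.

```python
import numpy as np

def project_l1(v, U):
    """Euclidean projection of v onto the l1-ball {w : ||w||_1 <= U},
    via the standard sort-and-threshold scheme."""
    if np.abs(v).sum() <= U:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]          # sorted magnitudes, descending
    css = np.cumsum(u)
    # largest index rho with u[rho] * (rho+1) > css[rho] - U
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > (css - U))[0][-1]
    theta = (css[rho] - U) / (rho + 1.0)  # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def ogd_l1(xs, ys, U, eta):
    """Projected online gradient descent on the square loss over the l1-ball.
    At each round: predict w.x, observe y, take a gradient step, project."""
    w = np.zeros(xs.shape[1])
    preds = []
    for x, y in zip(xs, ys):
        preds.append(float(w @ x))
        grad = 2.0 * (preds[-1] - y) * x   # gradient of (w.x - y)^2 in w
        w = project_l1(w - eta * grad, U)
    return np.array(preds), w
```

The projection keeps every iterate inside the comparison class, so the forecaster's cumulative square loss can be compared round by round against the best fixed w with ||w||_1 ≤ U, which is exactly the regret notion used above.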