We consider the problem of regression learning for deterministic design and independent random errors. We start by proving a sharp PAC-Bayesian type bound for the exponentially weighted aggregate (EWA) under the expected squared empirical loss. For a broad class of noise distributions, the presented bound is valid whenever the temperature parameter β of the EWA is larger than or equal to 4σ², where σ² is the noise variance. A remarkable feature of this result is that it holds even for unbounded regression functions, and the choice of the temperature parameter depends exclusively on the noise level. Next, we apply this general bound to the problem of aggregating the elements of a finite-dimensional linear space spanned by a dictionary of functions φ₁, ..., φ_M. We allow M to be much larger than the sample size n, but we assume that the true regression function can be well approximated by a sparse linear combination of the functions φ_j. Under this sparsity scenario, we propose an EWA with a heavy-tailed prior and show that it satisfies a sparsity oracle inequality with leading constant one. Finally, we propose several Langevin Monte-Carlo algorithms to approximately compute such an EWA when the number M of aggregated functions is large. We discuss the convergence of these algorithms in some detail and present numerical experiments that confirm our theoretical findings.
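To make the two ingredients of the abstract concrete, the following Python sketch shows (i) exponential weights over a finite dictionary of M functions and (ii) a plain Langevin Monte-Carlo iteration targeting the EWA over the linear span of the dictionary. This is a minimal illustration under stated assumptions, not the paper's exact construction: the function names (ewa_weights, langevin_ewa), the step-size and iteration parameters, and the Student-type prior π(λ) ∝ Π_j (τ² + λ_j²)⁻² are illustrative choices standing in for the heavy-tailed prior mentioned above; only the general form of the exponential weights and of the Langevin update follows the abstract.

    import numpy as np

    def ewa_weights(Y, preds, beta, prior=None):
        """Exponential weights over M fixed functions.

        Y: (n,) observations; preds: (M, n) with preds[j, i] = phi_j(x_i).
        Weight w_j is proportional to prior_j * exp(-||Y - preds_j||^2 / beta).
        """
        M, n = preds.shape
        if prior is None:
            prior = np.full(M, 1.0 / M)           # uniform prior by default
        risks = ((preds - Y) ** 2).sum(axis=1)    # empirical squared losses
        logw = np.log(prior) - risks / beta
        logw -= logw.max()                        # stabilize the exponentials
        w = np.exp(logw)
        return w / w.sum()

    def langevin_ewa(Y, Phi, beta, tau, step=1e-4, n_iter=20000, seed=None):
        """Langevin Monte-Carlo approximation of the EWA over the linear span.

        Targets p(lam) proportional to pi(lam) * exp(-||Y - Phi @ lam||^2 / beta),
        with the illustrative heavy-tailed prior
        pi(lam) proportional to prod_j (tau^2 + lam_j^2)^(-2).
        Returns the running average of the trajectory, an estimate of E[lam].
        """
        rng = np.random.default_rng(seed)
        n, M = Phi.shape
        lam = np.zeros(M)
        avg = np.zeros(M)
        for t in range(n_iter):
            resid = Y - Phi @ lam
            # gradient of the potential V(lam) = ||Y - Phi lam||^2/beta - log pi(lam)
            grad = -2.0 * (Phi.T @ resid) / beta + 4.0 * lam / (tau**2 + lam**2)
            # Euler discretization of the Langevin diffusion
            lam = lam - step * grad + np.sqrt(2.0 * step) * rng.standard_normal(M)
            avg += (lam - avg) / (t + 1)          # running mean of the iterates
        return avg

In line with the bound above, one would take the temperature β at least 4σ², where σ² is (an estimate of) the noise variance; the step size and number of iterations control the discretization and mixing error of the Langevin scheme, which is what makes the approach feasible when M is large and exact computation of the EWA is out of reach.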