We consider the problem of regression learning for deterministic design and independent random errors. We start by proving a sharp PAC-Bayesian type bound for the exponentially weighted aggregate (EWA) under the expected squared empirical loss. For a broad class of noise distributions, the presented bound is valid whenever the temperature parameter β of the EWA is larger than or equal to 4σ², where σ² is the noise variance. A remarkable feature of this result is that it holds even for unbounded regression functions, and the choice of the temperature parameter depends exclusively on the noise level. Next, we apply this general bound to the problem of aggregating the elements of a finite-dimensional linear space spanned by a dictionary of functions φ₁, ..., φ_M. We allow M to be much larger than the sample size n, but we assume that the true regression function can be well approximated by a sparse linear combination of the functions φ_j. Under this sparsity scenario, we propose an EWA with a heavy-tailed prior and show that it satisfies a sparsity oracle inequality with leading constant one. Finally, we propose several Langevin Monte-Carlo algorithms to approximately compute such an EWA when the number M of aggregated functions is large. We discuss the convergence of these algorithms in some detail and present numerical experiments that confirm our theoretical findings.
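To make the two ingredients of the abstract concrete, the following Python sketch shows (i) exponential weights over a finite dictionary of M functions and (ii) a plain Langevin Monte-Carlo iteration targeting the EWA over the linear span of the dictionary. This is a minimal illustration under stated assumptions, not the paper's exact construction: the function names (ewa_weights, langevin_ewa), the step-size and iteration parameters, and the Student-type prior π(λ) ∝ Π_j (τ² + λ_j²)⁻² are illustrative choices standing in for the heavy-tailed prior mentioned above; only the general form of the exponential weights and of the Langevin update follows the abstract.

    import numpy as np

    def ewa_weights(Y, preds, beta, prior=None):
        """Exponential weights over M fixed functions.

        Y: (n,) observations; preds: (M, n) with preds[j, i] = phi_j(x_i).
        Weight w_j is proportional to prior_j * exp(-||Y - preds_j||^2 / beta).
        """
        M, n = preds.shape
        if prior is None:
            prior = np.full(M, 1.0 / M)           # uniform prior by default
        risks = ((preds - Y) ** 2).sum(axis=1)    # empirical squared losses
        logw = np.log(prior) - risks / beta
        logw -= logw.max()                        # stabilize the exponentials
        w = np.exp(logw)
        return w / w.sum()

    def langevin_ewa(Y, Phi, beta, tau, step=1e-4, n_iter=20000, seed=None):
        """Langevin Monte-Carlo approximation of the EWA over the linear span.

        Targets p(lam) proportional to pi(lam) * exp(-||Y - Phi @ lam||^2 / beta),
        with the illustrative heavy-tailed prior
        pi(lam) proportional to prod_j (tau^2 + lam_j^2)^(-2).
        Returns the running average of the trajectory, an estimate of E[lam].
        """
        rng = np.random.default_rng(seed)
        n, M = Phi.shape
        lam = np.zeros(M)
        avg = np.zeros(M)
        for t in range(n_iter):
            resid = Y - Phi @ lam
            # gradient of the potential V(lam) = ||Y - Phi lam||^2/beta - log pi(lam)
            grad = -2.0 * (Phi.T @ resid) / beta + 4.0 * lam / (tau**2 + lam**2)
            # Euler discretization of the Langevin diffusion
            lam = lam - step * grad + np.sqrt(2.0 * step) * rng.standard_normal(M)
            avg += (lam - avg) / (t + 1)          # running mean of the iterates
        return avg

In line with the bound above, one would take the temperature β at least 4σ², where σ² is (an estimate of) the noise variance; the step size and number of iterations control the discretization and mixing error of the Langevin scheme, which is what makes the approach feasible when M is large and exact computation of the EWA is out of reach.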