Consider the standard linear regression model y = Xβ* + w, where y ∈ R^n is an observation vector, X ∈ R^{n×d} is a measurement matrix, β* ∈ R^d is the unknown regression vector, and w ∼ N(0, σ²I) is additive Gaussian noise. This paper determines sharp minimax rates of convergence for estimation of β* in l2-norm, assuming that β* belongs to a weak lq-ball Bq(Rq) for some q ∈ [0, 1]. We show that under suitable regularity conditions on the design matrix X, the minimax error in squared l2-norm scales as Rq (log d / n)^{1−q/2}. In addition, we provide lower bounds on rates of convergence for general lp-norms (for all p ∈ [1, +∞], p ≠ q). Our proofs of the lower bounds are information-theoretic in nature, based on Fano's inequality and results on the metric entropy of the balls Bq(Rq). Matching upper bounds are derived by direct analysis of the solution to an optimization algorithm over Bq(Rq). We prove that the conditions on X required by optimal algorithms are satisfied with high probability by broad classes of non-i.i.d. Gaussian random matrices, for which RIP or other sparse eigenvalue conditions are violated. For q = 0, l1-based methods (Lasso and Dantzig selector) achieve the minimax optimal rates in l2 error, but require stronger regularity conditions on the design than the nonconvex optimization algorithm used to determine the minimax upper bounds.
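The scaling of the minimax rate can be made concrete with a small sketch. The helper below (a hypothetical illustration, not code from the paper) evaluates the stated order of the squared l2-error, Rq (log d / n)^{1−q/2}, with all universal constants omitted; for hard sparsity (q = 0, R0 = s nonzeros) it reduces to the familiar s log(d)/n rate.

```python
import math

def minimax_rate(R_q: float, q: float, n: int, d: int) -> float:
    """Order of the minimax squared l2-error over a weak lq-ball Bq(Rq):
    R_q * (log d / n)^(1 - q/2), universal constants omitted."""
    assert 0.0 <= q <= 1.0 and n >= 1 and d >= 2
    return R_q * (math.log(d) / n) ** (1.0 - q / 2.0)

# Hard sparsity (q = 0): s = 5 nonzeros, n = 1000 samples, d = 10000
# coordinates gives the s * log(d) / n scaling.
rate_q0 = minimax_rate(5, 0.0, 1000, 10000)
print(rate_q0)
```

Note how the ambient dimension d enters only logarithmically, while the sample size n enters polynomially: doubling d barely moves the rate, whereas doubling n roughly halves it when q = 0.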