Minimax rates of convergence for high-dimensional regression under ℓq-ball sparsity

  • Authors:
  • Garvesh Raskutti; Martin J. Wainwright; Bin Yu

  • Affiliations:
  • Department of Statistics, UC Berkeley, Berkeley, CA; Department of Statistics and Department of EECS, UC Berkeley, Berkeley, CA; Department of Statistics and Department of EECS, UC Berkeley, Berkeley, CA

  • Venue:
  • Allerton '09: Proceedings of the 47th Annual Allerton Conference on Communication, Control, and Computing
  • Year:
  • 2009

Abstract

Consider the standard linear regression model y = Xβ* + w, where y ∈ R^n is an observation vector, X ∈ R^{n×d} is a measurement (design) matrix, β* ∈ R^d is the unknown regression vector, and w ∼ N(0, σ²I) is additive Gaussian noise. This paper determines sharp minimax rates of convergence for estimation of β* in ℓ2-norm, assuming that β* belongs to a weak ℓq-ball B_q(R_q) for some q ∈ [0, 1]. We show that under suitable regularity conditions on the design matrix X, the minimax error in squared ℓ2-norm scales as R_q (log d / n)^{1−q/2}. In addition, we provide lower bounds on rates of convergence for general ℓp-norms (for all p ∈ [1, +∞], p ≠ q). Our proofs of the lower bounds are information-theoretic in nature, based on Fano's inequality and results on the metric entropy of the balls B_q(R_q). Matching upper bounds are derived by direct analysis of the solution to an optimization problem over B_q(R_q). We prove that the conditions on X required by the optimal algorithms are satisfied with high probability by broad classes of non-i.i.d. Gaussian random matrices for which RIP or other sparse-eigenvalue conditions are violated. For q = 0, ℓ1-based methods (the Lasso and the Dantzig selector) achieve the minimax-optimal rates in ℓ2 error, but require stronger regularity conditions on the design than the nonconvex optimization over B_q(R_q) used to establish the minimax upper bounds.
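
In the exactly sparse case q = 0, the ball B_0(R_0) is the set of vectors with at most R_0 = s nonzero entries, so the rate R_q (log d / n)^{1−q/2} reduces to s log d / n. The sketch below is a minimal, hypothetical illustration of that q = 0 scaling using an ℓ1-based method (the Lasso): the i.i.d. standard Gaussian design, the sparsity level s, and the regularization choice λ ≈ σ√(log d / n) are simulation assumptions made here for illustration, not the constructions analyzed in the paper.

```python
# Hypothetical simulation: empirical check of the q = 0 scaling s*log(d)/n via the Lasso.
# The design (i.i.d. Gaussian), s, sigma, and the choice lam = 2*sigma*sqrt(log(d)/n)
# are illustrative assumptions, not the exact settings studied in the paper.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
d, s, sigma = 500, 10, 1.0

for n in (200, 400, 800, 1600):
    # Hard-sparse truth: beta* has exactly s nonzeros, i.e. beta* in B_0(s).
    beta_star = np.zeros(d)
    beta_star[:s] = 1.0

    X = rng.standard_normal((n, d))
    y = X @ beta_star + sigma * rng.standard_normal(n)

    # l1-penalized least squares; lam ~ sigma*sqrt(log d / n) is the usual theoretical scaling
    # (sklearn's Lasso minimizes (1/2n)||y - Xw||_2^2 + alpha*||w||_1).
    lam = 2.0 * sigma * np.sqrt(np.log(d) / n)
    fit = Lasso(alpha=lam, max_iter=50_000).fit(X, y)

    err = np.sum((fit.coef_ - beta_star) ** 2)   # squared l2 error of the estimate
    rate = s * np.log(d) / n                     # predicted minimax scaling for q = 0
    print(f"n={n:5d}  ||bhat-b*||_2^2={err:.4f}  s*log(d)/n={rate:.4f}  ratio={err/rate:.2f}")
```

If the error tracks the rate, the printed ratio should stay roughly constant as n grows, which is the qualitative behavior the minimax result predicts for ℓ1-based methods under q = 0 sparsity.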