Sharp thresholds for high-dimensional and noisy sparsity recovery using l1-constrained quadratic programming (Lasso)

  • Authors:
  • Martin J. Wainwright

  • Affiliations:
  • Department of Statistics, and Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA

  • Venue:
  • IEEE Transactions on Information Theory
  • Year:
  • 2009

Quantified Score

Hi-index 755.14

Abstract

The problem of consistently estimating the sparsity pattern of a vector β* ∈ ℝ^p based on observations contaminated by noise arises in various contexts, including signal denoising, sparse approximation, compressed sensing, and model selection. We analyze the behavior of l1-constrained quadratic programming (QP), also referred to as the Lasso, for recovering the sparsity pattern. Our main result is to establish precise conditions on the problem dimension p, the number k of nonzero elements in β*, and the number of observations n that are necessary and sufficient for sparsity pattern recovery using the Lasso. We first analyze the case of observations made using deterministic design matrices and sub-Gaussian additive noise, and provide sufficient conditions for support recovery and l∞-error bounds, as well as results showing the necessity of incoherence and bounds on the minimum value. We then turn to the case of random designs, in which each row of the design is drawn from a N(0, Σ) ensemble. For a broad class of Gaussian ensembles satisfying mutual incoherence conditions, we compute explicit values of thresholds 0 < θ_l(Σ) ≤ θ_u(Σ) < +∞ with the following properties: for any δ > 0, if n > 2(θ_u + δ)k log(p - k), then the Lasso succeeds in recovering the sparsity pattern with probability converging to one for large problems, whereas if n < 2(θ_l - δ)k log(p - k), then the probability of successful recovery converges to zero. For the special case of the uniform Gaussian ensemble (Σ = I_{p×p}), we show that θ_l = θ_u = 1, so that the precise threshold n = 2k log(p - k) is exactly determined.
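As a rough illustration of the sample-size scaling stated in the abstract, the sketch below evaluates 2θk log(p - k) and compares a given n against it. The function names `lasso_sample_threshold` and `predicted_regime` are hypothetical helpers introduced here, not part of the paper. The defaults θ_l = θ_u = 1 correspond to the uniform Gaussian ensemble; for other ensembles the values of θ_l(Σ) and θ_u(Σ) would have to be supplied by the user. The paper's results are asymptotic, so for finite problem sizes the output is only a heuristic guide.

```python
import math


def lasso_sample_threshold(p: int, k: int, theta: float = 1.0) -> float:
    """Return 2 * theta * k * log(p - k), the sample-size scale in the abstract.

    p is the problem dimension and k the number of nonzero entries of beta*.
    theta = 1 matches the uniform Gaussian ensemble (Sigma = I); any other
    value is an assumption the caller must justify for their own ensemble.
    """
    if not 0 < k < p:
        raise ValueError("need 0 < k < p so that log(p - k) is defined")
    return 2.0 * theta * k * math.log(p - k)


def predicted_regime(n: int, p: int, k: int,
                     theta_l: float = 1.0, theta_u: float = 1.0,
                     delta: float = 0.0) -> str:
    """Classify n relative to the lower and upper thresholds (heuristic only).

    Returns "success" if n exceeds 2(theta_u + delta) k log(p - k),
    "failure" if n is below 2(theta_l - delta) k log(p - k),
    and "undetermined" otherwise.
    """
    upper = lasso_sample_threshold(p, k, theta_u + delta)
    lower = lasso_sample_threshold(p, k, theta_l - delta)
    if n > upper:
        return "success"
    if n < lower:
        return "failure"
    return "undetermined"


if __name__ == "__main__":
    p, k = 1024, 8
    # For the uniform Gaussian ensemble this is about 110.8 observations.
    print(lasso_sample_threshold(p, k))
    print(predicted_regime(200, p, k))
    print(predicted_regime(50, p, k))
```

For example, with p = 1024 and k = 8 the threshold is about 111 observations, far fewer than p, which reflects the logarithmic dependence on the ambient dimension.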