Latent Variable Bayesian Models for Promoting Sparsity

Authors:
D. P. Wipf; B. D. Rao; S. Nagarajan
Affiliations:
Visual Comput. Group, Microsoft Res. Asia, Beijing, China (D. P. Wipf); not listed for the other authors
Venue:
IEEE Transactions on Information Theory
Year:
2011

Citing 0
Cited 3

Large Scale Bayesian Inference and Experimental Design for Sparse Linear Models
SIAM Journal on Imaging Sciences

A Bayesian approach to sparse dynamic network identification
Automatica (Journal of IFAC)

Sparse/robust estimation and Kalman smoothing with nonsmooth log-concave densities: modeling, computation, and theory
The Journal of Machine Learning Research

Abstract

Many practical methods for finding maximally sparse coefficient expansions involve solving a regression problem using a particular class of concave penalty functions. From a Bayesian perspective, this process is equivalent to maximum a posteriori (MAP) estimation using a sparsity-inducing prior distribution (Type I estimation). Using variational techniques, this distribution can always be conveniently expressed as a maximization over scaled Gaussian distributions modulated by a set of latent variables. Alternative Bayesian algorithms, which operate in latent variable space leveraging this variational representation, lead to sparse estimators reflecting posterior information beyond the mode (Type II estimation). Currently, it is unclear how the underlying cost functions of Type I and Type II relate, nor what relevant theoretical properties exist, especially with regard to Type II. Herein a common set of auxiliary functions is used to conveniently express both Type I and Type II cost functions in either coefficient or latent variable space, facilitating direct comparisons. In coefficient space, the analysis reveals that Type II is exactly equivalent to performing standard MAP estimation using a particular class of dictionary- and noise-dependent, nonfactorial coefficient priors. One prior (at least) from this class maintains several desirable advantages over all possible Type I methods and utilizes a novel, nonconvex approximation to the ℓ0 norm with most, and in certain quantifiable conditions all, local minima smoothed away. Importantly, the global minimum is always left unaltered, unlike standard ℓ1-norm relaxations. This ensures that any appropriate descent method is guaranteed to locate the maximally sparse solution.
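
To make the idea of estimation in latent variable space (Type II) more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a linear model y = Phi x + noise with known noise variance lam, one latent variance gamma_i per coefficient, and a flat hyperprior on gamma, and it uses the standard EM update from sparse Bayesian learning. The abstract does not specify any of these choices; the names Phi, y, lam, gamma, type2_cost, and sbl_em are hypothetical and introduced only for illustration.

```python
import numpy as np


def type2_cost(Phi, y, gamma, lam):
    """Type II cost in latent variable space (assumed flat hyperprior):
    log|Sigma_y| + y^T Sigma_y^{-1} y, with Sigma_y = lam*I + Phi diag(gamma) Phi^T."""
    n = Phi.shape[0]
    Sigma_y = lam * np.eye(n) + (Phi * gamma) @ Phi.T
    _, logdet = np.linalg.slogdet(Sigma_y)
    return logdet + y @ np.linalg.solve(Sigma_y, y)


def sbl_em(Phi, y, lam=1e-2, n_iter=100):
    """EM iterations over the latent variances gamma (illustrative sketch).

    Returns the posterior mean of the coefficients, the final gamma,
    and the cost after every iteration."""
    n, m = Phi.shape
    gamma = np.ones(m)  # assumed initialization
    costs = [type2_cost(Phi, y, gamma, lam)]
    for _ in range(n_iter):
        G = Phi * gamma  # Phi @ diag(gamma), shape (n, m)
        Sigma_y = lam * np.eye(n) + G @ Phi.T
        # E-step: posterior mean and marginal variances of the coefficients.
        mu = G.T @ np.linalg.solve(Sigma_y, y)
        post_var = gamma - np.sum(G * np.linalg.solve(Sigma_y, G), axis=0)
        # M-step: gamma_i = E[x_i^2]; clip tiny negative round-off values.
        gamma = np.maximum(mu**2 + post_var, 0.0)
        costs.append(type2_cost(Phi, y, gamma, lam))
    # Posterior mean under the final gamma serves as the coefficient estimate.
    G = Phi * gamma
    mu = G.T @ np.linalg.solve(lam * np.eye(n) + G @ Phi.T, y)
    return mu, gamma, np.array(costs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Phi = rng.standard_normal((20, 40))      # synthetic dictionary
    x0 = np.zeros(40)
    x0[[3, 17, 29]] = [1.5, -2.0, 1.0]       # synthetic sparse coefficients
    y = Phi @ x0 + 0.01 * rng.standard_normal(20)
    mu, gamma, costs = sbl_em(Phi, y)
    print("indices of the largest latent variances:", np.argsort(gamma)[-3:])
    print("cost at start and end:", costs[0], costs[-1])
```

In this sketch, sparsity arises because many entries of gamma are driven toward zero, which pins the corresponding coefficients at zero. EM is used here only because it is simple and does not increase the cost from one iteration to the next; it is one of several possible descent methods and tends to converge slowly.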