We present a new classification approach, using a variational Bayesian estimation of probit regression with Laplace priors. Laplace priors have previously been used extensively as a sparsity-inducing mechanism to perform feature selection simultaneously with classification or regression. However, contrary to the 'myth' of sparse Bayesian learning with Laplace priors, we find that the sparsity effect is a property of the maximum a posteriori (MAP) parameter estimates only. The Bayesian estimates, in turn, induce a posterior weighting rather than a hard selection of features, and have different advantageous properties: (1) they provide better estimates of the prediction uncertainty; (2) they retain correlated features, which favours generalisation; (3) they are more stable with respect to the hyperparameter choice; and (4) they produce a weight-based ranking of the features that is suited for interpretation. We analyse the behaviour of the Bayesian estimate in comparison with its MAP counterpart, as well as other related models, (a) through a graphical interpretation of the associated shrinkage and (b) through controlled numerical simulations in a range of testing conditions. The results pinpoint the situations in which the advantages of the Bayesian estimates can be exploited. Finally, we demonstrate our method on a gene expression classification task.
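To make the contrast between the MAP and the Bayesian estimate concrete, the sketch below (our illustration, not code from the paper) compares the two estimators in the simplest setting: a one-dimensional Gaussian likelihood with a Laplace prior on the weight. The values of sigma2 and lam are arbitrary illustrative choices. The MAP estimate reduces to the familiar soft-thresholding rule and yields exact zeros, whereas the posterior mean, computed here by numerical integration, only shrinks the weight towards zero.

```python
# A minimal numerical sketch (not from the paper): MAP estimate vs. posterior mean
# for a one-dimensional Gaussian likelihood y ~ N(w, sigma2) with a Laplace prior
# p(w) proportional to exp(-lam * |w|).
import numpy as np

sigma2 = 1.0   # assumed noise variance (illustrative choice)
lam = 1.0      # assumed Laplace prior scale (illustrative choice)

def map_estimate(y):
    """MAP estimate: soft-thresholding, which produces exact zeros."""
    return np.sign(y) * np.maximum(np.abs(y) - lam * sigma2, 0.0)

def posterior_mean(y, grid=np.linspace(-20.0, 20.0, 20001)):
    """Posterior mean by numerical integration: shrinks but never hits zero exactly."""
    log_post = -(y - grid) ** 2 / (2.0 * sigma2) - lam * np.abs(grid)
    w = np.exp(log_post - log_post.max())   # unnormalised posterior weights on the grid
    return np.sum(grid * w) / np.sum(w)

for y in [0.5, 1.5, 3.0]:
    print(f"y={y:4.1f}  MAP={map_estimate(y):+.3f}  posterior mean={posterior_mean(y):+.3f}")
```

The same qualitative contrast carries over to probit regression with Laplace priors on the weight vector, which is the setting analysed in the paper: the MAP solution prunes features exactly, while the variational Bayesian estimate retains all features with graded posterior weights.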