Sequential Monte Carlo on large binary sampling spaces

Authors:
Christian Schäfer; Nicolas Chopin
Affiliations:
Centre de Recherche en Économie et Statistique, Malakoff, France 92240 and CEntre de REcherches en MAthématiques de la DEcision, Université Paris-Dauphine, Paris, France 75775; Centre de Recherche en Économie et Statistique, Malakoff, France 92240 and Ecole Nationale de la Statistique et de l'Administration, Malakoff, France 92240
Venue:
Statistics and Computing
Year:
2013

Abstract

A Monte Carlo algorithm is said to be adaptive if it automatically calibrates its current proposal distribution using past simulations. The choice of the parametric family that defines the set of proposal distributions is critical for good performance. In this paper, we present such a parametric family for adaptive sampling on high-dimensional binary spaces.

A practical motivation for this problem is variable selection in a linear regression context. We want to sample from a Bayesian posterior distribution on the model space using an appropriate version of Sequential Monte Carlo.

Raw versions of Sequential Monte Carlo are easily implemented using binary vectors with independent components. For high-dimensional problems, however, these simple proposals do not yield satisfactory results. The key to an efficient adaptive algorithm is the use of binary parametric families that take correlations into account, analogously to the multivariate normal distribution on continuous spaces.

We provide a review of models for binary data and make one of them work in the context of Sequential Monte Carlo sampling. Computational studies on real-life data with about a hundred covariates suggest that, on difficult instances, our Sequential Monte Carlo approach clearly outperforms standard techniques based on Markov chain exploration.
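
To make the idea of an adaptive proposal with independent components concrete, the following is a minimal sketch and not the authors' algorithm. It uses plain adaptive importance sampling rather than a full Sequential Monte Carlo scheme, and every name in it (`fit_independent_bernoulli`, `toy_log_target`, the clipping constant `eps`, the particle count) is a hypothetical choice made for illustration. The proposal is a product of Bernoulli distributions whose success probabilities are refitted at each round from the weighted particles of the previous round.

```python
import math
import random


def fit_independent_bernoulli(particles, weights, eps=0.02):
    """Refit the proposal: weighted marginal means, clipped away from 0 and 1."""
    total = sum(weights)
    dim = len(particles[0])
    probs = []
    for j in range(dim):
        mean_j = sum(w * x[j] for x, w in zip(particles, weights)) / total
        probs.append(min(max(mean_j, eps), 1.0 - eps))
    return probs


def sample(probs, rng):
    """Draw one binary vector with independent components."""
    return tuple(1 if rng.random() < p else 0 for p in probs)


def log_prob(x, probs):
    """Log-probability of x under the product-Bernoulli proposal."""
    return sum(math.log(p if xi else 1.0 - p) for xi, p in zip(x, probs))


def toy_log_target(x):
    # Hypothetical unnormalised log-target (an assumption for this sketch):
    # components 0 and 1 are positively correlated, the rest are independent.
    return 2.0 * x[0] * x[1] - 0.5 * sum(x)


rng = random.Random(0)
dim = 4
probs = [0.5] * dim  # start from the uniform distribution on {0, 1}^dim

for _ in range(5):
    particles = [sample(probs, rng) for _ in range(2000)]
    log_w = [toy_log_target(x) - log_prob(x, probs) for x in particles]
    shift = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - shift) for lw in log_w]
    probs = fit_independent_bernoulli(particles, weights)

print([round(p, 2) for p in probs])
```

The fitted `probs` approximate the marginal inclusion probabilities of the toy target, but a product of Bernoulli distributions cannot represent the dependence between components 0 and 1. That limitation is the one the abstract points to when it calls for binary parametric families that take correlations into account.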