Discrete Hit-and-Run for Sampling Points from Arbitrary Distributions Over Subsets of Integer Hyperrectangles

  • Authors: Stephen Baumert; Archis Ghate; Seksan Kiatsupaibul; Yanfang Shen; Robert L. Smith; Zelda B. Zabinsky

  • Affiliations: U.S. Postal Service, Office of Inspector General, Arlington, Virginia 22209; Department of Industrial and Systems Engineering, University of Washington, Seattle, Washington 98195; Department of Statistics, Chulalongkorn University, Bangkok 10330, Thailand; Clearsight Systems, Inc., Seattle, Washington 98104; Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, Michigan 48109; Department of Industrial and Systems Engineering, University of Washington, Seattle, Washington 98195

  • Venue: Operations Research
  • Year: 2009

Abstract

We consider the problem of sampling a point from an arbitrary distribution π over an arbitrary subset S of an integer hyperrectangle. Neither the distribution π nor the support set S is assumed to be available as an explicit mathematical expression; each may be defined only through an oracle, in particular a computer program. This problem commonly arises in black-box discrete optimization as well as in counting and estimation problems. The generality of this setting and the high dimensionality of S preclude the application of conventional random variable generation methods. We therefore turn to Markov chain Monte Carlo (MCMC) sampling, in which we execute an ergodic Markov chain that converges to π, so that the distribution of the point delivered after sufficiently many steps can be made arbitrarily close to π. Unfortunately, classical Markov chains, such as the nearest-neighbor random walk or the coordinate direction random walk, fail to converge to π because they can get trapped in isolated regions of the support set. To surmount this difficulty, we propose discrete hit-and-run (DHR), a Markov chain motivated by the hit-and-run algorithm, which is known to be the most efficient method for sampling from log-concave distributions over convex bodies in R^n. We prove that the limiting distribution of DHR is π, as desired, which enables us to sample approximately from π by delivering the last iterate after a sufficiently large number of DHR iterations. In addition to this asymptotic analysis, we investigate the finite-time behavior of DHR and present a variety of examples in which DHR exhibits polynomial performance.
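The trapping behavior mentioned in the abstract can be illustrated with a small sketch. It is not the DHR algorithm, whose details are not given here; it only shows a nearest-neighbor Metropolis walk over a support set defined by an oracle. The set S (two slabs of a 10 x 10 integer grid separated by a gap), the uniform target, and the function names `in_support`, `weight`, `nearest_neighbor_step`, and `run_chain` are all assumptions made for illustration.

```python
import random


def in_support(point):
    """Hypothetical membership oracle for S: two slabs separated by a gap."""
    x, y = point
    return 0 <= y <= 9 and (0 <= x <= 3 or 6 <= x <= 9)


def weight(point):
    """Hypothetical unnormalized target pi; taken to be uniform over S."""
    return 1.0


def nearest_neighbor_step(point, rng):
    """One Metropolis step: propose +1 or -1 in one coordinate."""
    proposal = list(point)
    axis = rng.randrange(len(point))
    proposal[axis] += rng.choice((-1, 1))
    proposal = tuple(proposal)
    # Proposals outside S are rejected, so the chain stays where it is.
    if not in_support(proposal):
        return point
    # Standard Metropolis acceptance for a symmetric proposal.
    if rng.random() < min(1.0, weight(proposal) / weight(point)):
        return proposal
    return point


def run_chain(start, steps, seed=0):
    """Run the walk and return the set of points it visited."""
    rng = random.Random(seed)
    point = start
    visited = {point}
    for _ in range(steps):
        point = nearest_neighbor_step(point, rng)
        visited.add(point)
    return visited


visited = run_chain((0, 0), 20000)
reached_right_slab = any(x >= 6 for x, _ in visited)
print(reached_right_slab)  # False: unit steps cannot cross the gap at x = 4, 5
```

Because every proposal changes one coordinate by one unit, the walk started in the left slab can never step across the gap, whatever the number of iterations. Its limiting distribution is therefore supported on the left slab only and cannot equal a target that puts mass on both slabs.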