Consistency of Sequential Bayesian Sampling Policies

Authors:
Peter I. Frazier;Warren B. Powell
Affiliations:
pf98@cornell.edu;powell@princeton.edu
Venue:
SIAM Journal on Control and Optimization
Year:
2011

Abstract

We consider Bayesian information collection, in which a measurement policy collects information to support a future decision. This framework includes ranking and selection, continuous global optimization, and many other problems in sequential experimental design. We give a sufficient condition under which a measurement policy samples each measurement type infinitely often, which ensures consistency, i.e., that a globally optimal future decision is found in the limit. This condition is useful for verifying the consistency of adaptive sequential sampling policies that do not perform forced random exploration, a feature that makes consistency difficult to verify by other means. We demonstrate the use of this sufficient condition by showing consistency of two previously proposed ranking and selection policies: optimal computing budget allocation (OCBA) for linear loss, and the knowledge-gradient policy with independent normal priors. Consistency of the knowledge-gradient policy was shown previously, while the consistency result for OCBA is new.
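
To make the setting concrete, the following is a minimal sketch of a knowledge-gradient policy with independent normal priors, written under several assumptions that are not taken from the abstract: the sampling noise variance is known and common to all alternatives, the knowledge-gradient factor is the usual closed form for this case (the predictive standard deviation of the change in the posterior mean multiplied by f(z) = zΦ(z) + φ(z)), and far in the tail f(-s) is approximated by φ(s)/(s² + 1) to avoid floating-point underflow. The names `log_kg_factor` and `run_kg`, the prior parameters, and the true means are all hypothetical. The policy contains no forced random exploration, and the per-alternative sample counts it returns are the quantity that the consistency result concerns.

```python
import math
import random


def log_kg_factor(mu, var, noise_var, x):
    """Log of the knowledge-gradient factor of alternative x.

    Assumes independent normal beliefs N(mu[i], var[i]) and a known
    sampling noise variance (both are assumptions of this sketch).
    """
    # Std. dev. of the change in the posterior mean of x after one more sample.
    sigma_tilde = var[x] / math.sqrt(var[x] + noise_var)
    best_other = max(m for i, m in enumerate(mu) if i != x)
    s = abs(mu[x] - best_other) / sigma_tilde  # s = -z >= 0
    log_pdf = -0.5 * s * s - 0.5 * math.log(2.0 * math.pi)
    if s < 10.0:
        # Exact value: f(-s) = pdf(s) - s * P(Z > s), with the tail from erfc.
        tail = 0.5 * math.erfc(s / math.sqrt(2.0))
        log_f = math.log(math.exp(log_pdf) - s * tail)
    else:
        # Tail approximation f(-s) ~ pdf(s) / (s^2 + 1); a numerical shortcut
        # chosen for this sketch so the factor never underflows to zero.
        log_f = log_pdf - math.log(s * s + 1.0)
    return math.log(sigma_tilde) + log_f


def run_kg(true_means, noise_var=1.0, prior_var=100.0, n_steps=300, seed=0):
    """Run the sketch policy; return posterior means, variances, and counts."""
    rng = random.Random(seed)
    k = len(true_means)
    mu = [0.0] * k           # prior means (hypothetical choice)
    var = [prior_var] * k    # prior variances (hypothetical choice)
    counts = [0] * k
    for _ in range(n_steps):
        # Measure the alternative with the largest knowledge-gradient factor.
        x = max(range(k), key=lambda i: log_kg_factor(mu, var, noise_var, i))
        y = rng.gauss(true_means[x], math.sqrt(noise_var))
        # Conjugate normal update with known noise variance.
        precision = 1.0 / var[x] + 1.0 / noise_var
        mu[x] = (mu[x] / var[x] + y / noise_var) / precision
        var[x] = 1.0 / precision
        counts[x] += 1
    return mu, var, counts


if __name__ == "__main__":
    mu, var, counts = run_kg([0.0, 0.5, 1.0])
    print("posterior means:", mu)
    print("samples per alternative:", counts)
```

A finite run like this can only illustrate the behavior: the counts show how the policy spreads its measurements across alternatives, whereas the sufficient condition described above concerns whether every count grows without bound as the number of measurements goes to infinity.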