Restless Bandits, Linear Programming Relaxations, and a Primal-Dual Index Heuristic

  • Authors:
  • Dimitris Bertsimas; José Niño-Mora

  • Venue:
  • Operations Research
  • Year:
  • 2000

Quantified Score

Hi-index 0.06

Abstract

We develop a mathematical programming approach for the classical PSPACE-hard restless bandit problem in stochastic optimization. We introduce a hierarchy of N (where N is the number of bandits) increasingly stronger linear programming relaxations, the last of which is exact and corresponds to the (exponential size) formulation of the problem as a Markov decision chain, while the other relaxations provide bounds and are efficiently computed. We also propose a priority-index heuristic scheduling policy from the solution to the first-order relaxation, where the indices are defined in terms of optimal dual variables. In this way we propose a policy and a suboptimality guarantee. We report results of computational experiments that suggest that the proposed heuristic policy is nearly optimal. Moreover, the second-order relaxation is found to provide strong bounds on the optimal value.
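
For orientation, the first-order relaxation mentioned in the abstract can be sketched as a linear program over per-bandit state-action frequencies. The abstract does not fix any notation, so everything below is an assumption made for illustration: a discounted-reward setting with discount factor \beta, bandit n with finite state space E_n, actions a = 1 (active) and a = 0 (passive), rewards R_i^a, transition probabilities p_{ij}^a, initial state distribution \alpha_j, exactly M bandits active per period, and x_i^a the expected total discounted time that a bandit spends in state i under action a.

```latex
\begin{align*}
Z^1 = \max_{x} \quad
  & \sum_{n=1}^{N} \sum_{i \in E_n} \sum_{a \in \{0,1\}} R_i^a \, x_i^a \\
\text{s.t.} \quad
  & x_j^0 + x_j^1 = \alpha_j + \beta \sum_{i \in E_n} \sum_{a \in \{0,1\}} p_{ij}^a \, x_i^a,
    \qquad j \in E_n,\ n = 1, \dots, N, \\
  & \sum_{n=1}^{N} \sum_{i \in E_n} x_i^1 = \frac{M}{1 - \beta}, \\
  & x_i^a \ge 0, \qquad i \in E_n,\ a \in \{0,1\},\ n = 1, \dots, N.
\end{align*}
```

Under these assumptions, the first family of constraints is the usual discounted flow balance for each bandit taken separately, and the single coupling constraint requires M bandits to be active only in total discounted expectation rather than in every period, which is why Z^1 is an upper bound on the optimal value. The number of variables and constraints grows with the sum of the individual state-space sizes rather than their product, consistent with the abstract's remark that the lower-order relaxations are efficiently computed. The priority indices of the heuristic are then read off the optimal dual solution of this program.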