Characterization and computation of restless bandit marginal productivity indices

Authors:
José Niño-Mora
Affiliations:
Universidad Carlos III de Madrid, Getafe (Madrid), Spain
Venue:
Proceedings of the 2nd international conference on Performance evaluation methodologies and tools
Year:
2007

Citing 8
Cited 5

Numerical linear algebra algorithms and software

Journal of Computational and Applied Mathematics - Special issue on numerical analysis 2000 Vol. III: linear algebra
Markov Decision Processes: Discrete Stochastic Dynamic Programming

Markov Decision Processes: Discrete Stochastic Dynamic Programming
Marginal productivity index policies for scheduling a multiclass delay-/loss-sensitive queue

Queueing Systems: Theory and Applications
A stochastic control approach for scheduling multimedia transmissions over a polled multiaccess fading channel

Wireless Networks
Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues

Mathematics of Operations Research
Computing an index policy for bandits with switching penalties

Proceedings of the 2nd international conference on Performance evaluation methodologies and tools
A (2/3)n^3 Fast-Pivoting Algorithm for the Gittins Index and Optimal Stopping of a Markov Chain

INFORMS Journal on Computing
A Faster Index Algorithm and a Computational Study for Bandits with Switching Costs

INFORMS Journal on Computing

Computing an index policy for bandits with switching penalties

Proceedings of the 2nd international conference on Performance evaluation methodologies and tools
Computing an index policy for multiarmed bandits with deadlines

Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools
Admission control and routing to parallel queues with delayed information via marginal productivity indices

Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools
A modeling framework for optimizing the flow-level scheduling with time-varying channels

Performance Evaluation
Value of information in optimal flow-level scheduling of users with Markovian time-varying channels

Performance Evaluation

Abstract

The restless bandit problem furnishes a powerful modeling paradigm for settings involving the optimal dynamic priority allocation to multiple stochastic projects, given as binary-action (active/passive) Markov decision processes (MDPs). Though the problem is generally intractable, Whittle (1988) introduced a tractable priority-index policy, which has been developed in recent work by the author in an extended framework based on the unifying concept of marginal productivity index (MPI). A growing body of empirical evidence shows MPI policies to be nearly optimal in diverse applications, which motivates interest in computing the MPI efficiently. For this purpose, we extend to restless bandits the parametric linear programming (LP) approach deployed in [J. Niño-Mora. A (2/3)n^3 fast-pivoting algorithm for the Gittins index and optimal stopping of a Markov chain, INFORMS J. Comp., in press] for classic (nonrestless) bandits. Yet the extension is not straightforward, as the MPI is only defined for the restricted range of so-called indexable bandits, which motivates the quest for methods to establish indexability. This paper furnishes algorithmic and analytical tools to realize the potential of MPI policies in large-scale applications, presenting the following contributions: (i) an algorithmic characterization of indexability, for which two block implementations are given; and (ii) new analytical conditions for indexability, termed LP-indexability, that leverage knowledge of the structure of optimal policies, under which the MPI is computed faster by the adaptive-greedy algorithm previously introduced by the author under more stringent (PCL-indexability) conditions, for which a new fast-pivoting block implementation is given. The paper further reports on a computational study, which measures the runtime performance of the algorithms and demonstrates the high prevalence of indexability and PCL-indexability.
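
To make the notion of an index concrete, the following is a minimal Python sketch of a naive, definition-based computation for a discounted binary-action project. It is not the parametric LP or adaptive-greedy algorithm of the paper. It assumes the project is indexable, that each active period uses one unit of work, and that the supplied bracket `[lo, hi]` contains the index. The names `advantage` and `whittle_index`, the data layout `R[a][i]` and `P[a][i][j]`, and the two-state example are all hypothetical choices made for illustration.

```python
def advantage(R, P, beta, lam, state, tol=1e-10, max_iter=100000):
    """Q(state, active) - Q(state, passive) when each active period is charged lam.

    R[a][i] is the one-period reward and P[a][i][j] the transition probability
    under action a (0 = passive, 1 = active); beta is the discount factor.
    The charged problem is solved by plain value iteration.
    """
    n = len(R[0])
    V = [0.0] * n
    for _ in range(max_iter):
        Q = [[R[a][i] - lam * a + beta * sum(P[a][i][j] * V[j] for j in range(n))
              for i in range(n)] for a in (0, 1)]
        V_new = [max(Q[0][i], Q[1][i]) for i in range(n)]
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            break
        V = V_new
    return Q[1][state] - Q[0][state]


def whittle_index(R, P, beta, state, lo, hi, tol=1e-8):
    """Bisect on the charge lam until active and passive are indifferent in `state`.

    Assumes indexability: active is optimal in `state` below the index and
    passive is optimal above it, so the advantage changes sign once in [lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if advantage(R, P, beta, mid, state) > 0:
            lo = mid   # active still strictly better: the index is larger
        else:
            hi = mid   # passive at least as good: the index is no larger
    return 0.5 * (lo + hi)


# Hypothetical classic (nonrestless) example: passive freezes the state and earns 0.
# Active in state 0 earns 0 and moves to state 1; active in state 1 earns 1 and stays.
R = [[0.0, 0.0], [0.0, 1.0]]
P = [[[1.0, 0.0], [0.0, 1.0]],   # passive transitions
     [[0.0, 1.0], [0.0, 1.0]]]   # active transitions
indices = [whittle_index(R, P, 0.9, i, -10.0, 10.0) for i in range(2)]
print(indices)
```

For this nonrestless example the computed values should agree with the Gittins indices, roughly 0.9 for state 0 and 1.0 for state 1. Each bisection step re-solves an MDP, which is the kind of cost a dedicated index algorithm is meant to avoid.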