Numerical linear algebra algorithms and software
Journal of Computational and Applied Mathematics - Special issue on numerical analysis 2000 Vol. III: linear algebra
Markov Decision Processes: Discrete Stochastic Dynamic Programming
Marginal productivity index policies for scheduling a multiclass delay-/loss-sensitive queue
Queueing Systems: Theory and Applications
Mathematics of Operations Research
Computing an index policy for bandits with switching penalties
Proceedings of the 2nd international conference on Performance evaluation methodologies and tools
A (2/3)n^3 Fast-Pivoting Algorithm for the Gittins Index and Optimal Stopping of a Markov Chain
INFORMS Journal on Computing
A Faster Index Algorithm and a Computational Study for Bandits with Switching Costs
INFORMS Journal on Computing
Computing an index policy for multiarmed bandits with deadlines
Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools
A modeling framework for optimizing the flow-level scheduling with time-varying channels
Performance Evaluation
The restless bandit problem furnishes a powerful modeling paradigm for settings involving the optimal dynamic priority allocation to multiple stochastic projects, modeled as binary-action (active/passive) Markov decision processes (MDPs). Though the problem is generally intractable, Whittle (1988) introduced a tractable priority-index policy, which has been developed in recent work by the author in an extended framework based on the unifying concept of the marginal productivity index (MPI). A growing body of empirical evidence shows MPI policies to be nearly optimal in diverse applications, which motivates the interest in computing the MPI efficiently. To this end, we extend to restless bandits the parametric linear programming (LP) approach deployed in [J. Niño-Mora. A (2/3)n^3 fast-pivoting algorithm for the Gittins index and optimal stopping of a Markov chain, INFORMS J. Comp., in press] for classic (nonrestless) bandits. Yet the extension is not straightforward, as the MPI is only defined for the restricted range of so-called indexable bandits, which motivates the quest for methods to establish indexability. This paper furnishes algorithmic and analytical tools to realize the potential of MPI policies in large-scale applications, presenting the following contributions: (i) an algorithmic characterization of indexability, for which two block implementations are given; and (ii) new analytical conditions for indexability, termed LP-indexability, that leverage knowledge of the structure of optimal policies, under which the MPI is computed faster by the adaptive-greedy algorithm previously introduced by the author under more stringent (PCL-indexability) conditions, for which a new fast-pivoting block implementation is given. The paper further reports on a computational study, which measures the runtime performance of the algorithms and demonstrates the high prevalence of indexability and PCL-indexability.
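To make the notion of indexability concrete, the following is a minimal brute-force sketch. It is not the paper's parametric LP or adaptive-greedy algorithm. It assumes Whittle's passive-subsidy formulation: a subsidy lam is added to the passive reward, the resulting discounted MDP is solved by value iteration, and the project is taken to be indexable on a grid of subsidies if the set of states where the passive action is optimal only grows as lam increases. The function names (passive_set, check_indexability), the data layout P[a][i][j] and R[a][i], and the two-state example are all hypothetical and chosen only for illustration.

```python
def passive_set(P, R, beta, lam, tol=1e-12, max_iter=10_000):
    """States where the passive action (a = 0) is optimal under subsidy lam.

    P[a][i][j]: transition probability i -> j under action a (0 passive, 1 active).
    R[a][i]:    one-period reward in state i under action a.
    """
    n = len(R[0])
    V = [0.0] * n
    Q = None
    for _ in range(max_iter):
        # Q[a][i]: value of taking action a in state i, then acting optimally.
        Q = [
            [R[a][i] + (lam if a == 0 else 0.0)
             + beta * sum(P[a][i][j] * V[j] for j in range(n))
             for i in range(n)]
            for a in (0, 1)
        ]
        V_new = [max(Q[0][i], Q[1][i]) for i in range(n)]
        gap = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if gap < tol:
            break
    # Ties (within a small numerical tolerance) are counted as passive.
    return frozenset(i for i in range(n) if Q[0][i] >= Q[1][i] - 1e-9)


def check_indexability(P, R, beta, lambdas):
    """Check on a grid whether passive sets are nested as lam grows.

    Returns (nested, index), where index[i] is the first grid value of lam
    at which state i becomes passive (None if it never does on the grid).
    The estimates are meaningful only when nested is True.
    """
    n = len(R[0])
    prev = frozenset()
    index = [None] * n
    nested = True
    for lam in sorted(lambdas):
        cur = passive_set(P, R, beta, lam)
        if not prev <= cur:          # a state left the passive set
            nested = False
        for i in cur - prev:
            if index[i] is None:
                index[i] = lam
        prev = cur
    return nested, index


# Hypothetical two-state classic (nonrestless) bandit: passive freezes the
# state and earns nothing; active earns R[1][i] and moves to state 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],   # passive: state frozen
    [[0.0, 1.0], [0.0, 1.0]],   # active: always move to state 1
]
R = [
    [0.0, 0.0],                 # passive rewards
    [1.0, 0.2],                 # active rewards
]
beta = 0.9
grid = [k / 20 - 0.5 for k in range(41)]   # lam from -0.5 to 1.5, step 0.05

nested, index = check_indexability(P, R, beta, grid)
print(nested, index)
```

In this example the passive sets are nested, and the grid estimates land within one grid step of the values 1 and 0.2 that a direct calculation gives for states 0 and 1. A grid check of this kind can only suggest indexability at the chosen resolution; it does not prove it, and its cost grows with the number of grid points, which is part of why dedicated index algorithms are of interest.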