Indexability of restless bandit problems and optimality of Whittle index for dynamic multichannel access

Authors:
Keqin Liu; Qing Zhao
Affiliations:
Department of Electrical and Computer Engineering, University of California at Davis, Davis, CA; Department of Electrical and Computer Engineering, University of California at Davis, Davis, CA
Venue:
IEEE Transactions on Information Theory
Year:
2010

Abstract

In this paper, we consider a class of restless multiarmed bandit processes (RMABs) that arises in dynamic multichannel access, user/server scheduling, and optimal activation in multiagent systems. For this class of RMABs, we establish indexability and obtain the Whittle index in closed form under both the discounted and average reward criteria. These results lead to a direct implementation of the Whittle index policy with remarkably low complexity. When arms are stochastically identical, we show that the Whittle index policy is optimal under certain conditions. Furthermore, it has a semiuniversal structure that obviates the need to know the Markov transition probabilities. Both the optimality and the semiuniversal structure follow from the equivalence between the Whittle index policy and the myopic policy established in this work. For nonidentical arms, we develop efficient algorithms for computing a performance upper bound given by Lagrangian relaxation. The tightness of the upper bound and the near-optimal performance of the Whittle index policy are illustrated with simulation examples.
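
As a rough illustration of the myopic policy mentioned in the abstract, the Python sketch below simulates multichannel access for stochastically identical arms. It is not the paper's implementation. The abstract does not spell out the channel model, so the sketch assumes each channel is a two-state (good/bad) Markov chain with hypothetical transition probabilities p01 (bad to good) and p11 (good to good), that the user senses k of n channels per slot, and that it earns one unit of reward for each sensed channel found in the good state. The function names update_belief, myopic_select, and simulate are illustrative only. The closed-form Whittle index itself is not reproduced here.

```python
import random


def update_belief(belief, p01, p11, observed=None):
    """One-step belief update for an assumed two-state Markov channel.

    belief is the probability that the channel is currently good.
    observed is 1 (good), 0 (bad), or None if the channel was not sensed.
    """
    if observed is None:
        # Unobserved channel: propagate the belief through the Markov chain.
        return belief * p11 + (1.0 - belief) * p01
    # Observed channel: the next-slot belief depends only on the observed state.
    return p11 if observed == 1 else p01


def myopic_select(beliefs, k):
    """Return the indices of the k channels with the largest current belief."""
    order = sorted(range(len(beliefs)), key=lambda i: beliefs[i], reverse=True)
    return sorted(order[:k])


def simulate(n=5, k=2, p01=0.2, p11=0.8, horizon=1000, seed=0):
    """Average per-slot reward of the myopic policy (all parameters hypothetical)."""
    rng = random.Random(seed)
    stationary = p01 / (p01 + 1.0 - p11)
    states = [1 if rng.random() < stationary else 0 for _ in range(n)]
    beliefs = [stationary] * n
    total = 0
    for _ in range(horizon):
        chosen = myopic_select(beliefs, k)
        # Reward: number of sensed channels that are good in this slot.
        total += sum(states[i] for i in chosen)
        # Update beliefs from this slot's observations.
        beliefs = [
            update_belief(beliefs[i], p01, p11, states[i] if i in chosen else None)
            for i in range(n)
        ]
        # Channels then evolve independently to the next slot.
        states = [
            1 if rng.random() < (p11 if s == 1 else p01) else 0 for s in states
        ]
    return total / horizon
```

Calling simulate() with different p01 and p11 values gives a quick feel for how channel memory affects the reward collected by always sensing the channels currently believed most likely to be good.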