Multi-channel opportunistic access: a case of restless bandits with multiple plays

Authors:
Sahand Haji Ali Ahmad;Mingyan Liu
Affiliations:
Dept. of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI;Dept. of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI
Venue:
Allerton'09 Proceedings of the 47th annual Allerton conference on Communication, control, and computing
Year:
2009

Abstract

This paper considers the following stochastic control problem that arises in opportunistic spectrum access: a system consists of n channels, where the state ("good" or "bad") of each channel evolves as a Markov process and the n processes are independent and identically distributed. In each time slot, a user selects exactly k channels to sense and, based on the sensing result, access. A reward is obtained whenever the user senses and accesses a "good" channel. The objective is to design a channel selection policy that maximizes the expected discounted total reward accrued over a finite or infinite horizon. In our previous work we established the optimality of a greedy policy for the special case of k = 1 (i.e., single-channel access) under the condition that the channel state transitions are positively correlated over time. In this paper we show that, under the same condition, the greedy policy is optimal for the general case of k ≥ 1; the methodology introduced here is thus more general. This problem may be viewed as a special case of the restless bandit problem with multiple plays. We discuss connections between the current problem and the existing literature on this class of problems.
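
To make the setting concrete, the following is a minimal simulation sketch of a greedy channel selection rule, not the authors' implementation. It assumes the greedy policy senses the k channels with the highest current probability of being good, and that each channel is a two-state Markov chain with hypothetical parameters `p11` (good to good) and `p01` (bad to good), where positive correlation corresponds to `p11 >= p01`. The function names, the belief representation, and the stationary initialization are illustrative assumptions that do not appear in the abstract.

```python
import random


def greedy_select(beliefs, k):
    """Return the indices of the k channels with the highest belief of being good.

    Ties are broken in favor of the lower index (an arbitrary, assumed rule).
    """
    order = sorted(range(len(beliefs)), key=lambda i: beliefs[i], reverse=True)
    return order[:k]


def update_beliefs(beliefs, sensed, p11, p01):
    """Advance every belief by one slot.

    `sensed` maps a channel index to its observed state (1 = good, 0 = bad).
    A sensed channel's next-slot belief depends only on what was observed;
    an unsensed channel's belief is propagated through the Markov chain.
    """
    updated = []
    for i, w in enumerate(beliefs):
        if i in sensed:
            updated.append(p11 if sensed[i] == 1 else p01)
        else:
            updated.append(w * p11 + (1 - w) * p01)
    return updated


def simulate(n, k, p11, p01, horizon, beta=1.0, seed=0):
    """Simulate the greedy rule and return the discounted total reward.

    Assumes unit reward per good channel accessed, discount factor `beta`,
    and channels started from the stationary distribution (requires that
    p11 = 1 and p01 = 0 do not hold together).
    """
    rng = random.Random(seed)
    stationary = p01 / (1 - p11 + p01)
    states = [1 if rng.random() < stationary else 0 for _ in range(n)]
    beliefs = [stationary] * n
    total = 0.0
    for t in range(horizon):
        chosen = greedy_select(beliefs, k)
        sensed = {i: states[i] for i in chosen}
        total += (beta ** t) * sum(sensed.values())
        beliefs = update_beliefs(beliefs, sensed, p11, p01)
        # Every channel evolves whether or not it was sensed (the "restless" part).
        states = [1 if rng.random() < (p11 if s == 1 else p01) else 0 for s in states]
    return total
```

For example, `simulate(n=5, k=2, p11=0.8, p01=0.2, horizon=50, beta=0.95)` returns one sample of the discounted reward; averaging over many seeds would approximate the expected reward that the policy is meant to maximize.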