Multi-channel opportunistic access: a case of restless bandits with multiple plays

  • Authors:
  • Sahand Haji Ali Ahmad;Mingyan Liu

  • Affiliations:
  • Dept. of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI;Dept. of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI

  • Venue:
  • Allerton'09 Proceedings of the 47th annual Allerton conference on Communication, control, and computing
  • Year:
  • 2009

Quantified Score

Hi-index 0.06

Visualization

Abstract

This paper considers the following stochastic control problem that arises in opportunistic spectrum access: a system consists of n channels where the state ("good" or "bad") of each channel evolves as independent and identically distributed Markov processes. A user can select exactly k channels to sense and access (based on the sensing result) in each time slot. A reward is obtained whenever the user senses and accesses a "good" channel. The objective is to design a channel selection policy that maximizes the expected discounted total reward accrued over a finite or infinite horizon. In our previous work we established the optimality of a greedy policy for the special case of k = 1 (i.e., single channel access) under the condition that the channel state transitions are positively correlated over time. In this paper we show under the same condition the greedy policy is optimal for the general case of k ≥ 1; the methodology introduced here is thus more general. This problem may be viewed as a special case of the restless bandit problem, with multiple plays. We discuss connections between the current problem and existing literature on this class of problems.