On balancing exploration vs. exploitation in a cognitive engine for multi-antenna systems

Authors:
Haris I. Volos;R. Michael Buehrer
Affiliations:
Virginia Polytechnic Institute and State University;Virginia Polytechnic Institute and State University
Venue:
GLOBECOM'09 Proceedings of the 28th IEEE conference on Global telecommunications
Year:
2009

Abstract

In this paper, we define the problem of balancing exploration vs. exploitation in a cognitive-engine-controlled multi-antenna communication system in terms of the classical multi-armed bandit framework. We then apply the ε-greedy strategy and the Gittins' indices method to the problem in a system with no prior information. Results show that the Gittins' indices assuming a normal reward process had the best overall performance, ahead of the Gittins' indices assuming a Bernoulli reward process and the ε-greedy strategy. The ε-greedy strategy was found to be inefficient in most cases; the exception was the combination of a low number of trials and a low SNR, where it performed better than the other methods. Nevertheless, the Gittins' indices method should generally be preferred, as it is more consistent than the ε-greedy strategy across different scenarios.
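To make the bandit framing concrete, the following is a minimal sketch of an ε-greedy learner, not the authors' implementation. It assumes each arm stands for one candidate transmission configuration and that the reward is a Bernoulli outcome (for example, whether a packet was delivered). The function name `epsilon_greedy`, the parameter `success_probs`, and the numeric values in the usage lines are hypothetical and chosen only for illustration.

```python
import random


def epsilon_greedy(success_probs, trials, epsilon=0.1, seed=0):
    """Run an epsilon-greedy bandit over arms with Bernoulli rewards.

    success_probs: hypothetical true success probability of each arm,
                   unknown to the learner and used only to draw rewards.
    Returns (pulls per arm, estimated mean reward per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    pulls = [0] * n_arms      # number of times each arm was tried
    means = [0.0] * n_arms    # running estimate of each arm's mean reward
    total = 0

    for _ in range(trials):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate, ties broken at random.
            best = max(means)
            arm = rng.choice([i for i, m in enumerate(means) if m == best])

        # Draw a Bernoulli reward for the chosen arm.
        reward = 1 if rng.random() < success_probs[arm] else 0

        # Incremental update of the sample mean for that arm.
        pulls[arm] += 1
        means[arm] += (reward - means[arm]) / pulls[arm]
        total += reward

    return pulls, means, total


if __name__ == "__main__":
    # Three hypothetical configurations with different success rates.
    pulls, means, total = epsilon_greedy([0.2, 0.5, 0.9], trials=1000, epsilon=0.1)
    print("pulls per arm:", pulls)
    print("estimated means:", [round(m, 3) for m in means])
    print("total reward:", total)
```

The fixed exploration rate is what distinguishes this strategy from an index policy: it keeps spending a constant fraction of trials on random arms regardless of how much has already been learned, whereas a Gittins' index policy assigns each arm an index from its observed history and always plays the arm with the largest index.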