Balancing exploration and exploitation ratio in reinforcement learning

  • Authors: Ozkan Ozcan; Claudio Coreixas de Moraes; Jonathan Alt
  • Affiliation (all authors): MOVES Institute, Naval Postgraduate School, Monterey, California
  • Venue: Proceedings of the 2011 Military Modeling & Simulation Symposium
  • Year: 2011

Abstract

Controlling the ratio of exploration to exploitation in agent learning in dynamic environments remains a continuing challenge in the application of agent learning techniques. Representing human behavior requires methods that control this ratio in a manner that mimics human behavior, constraining agent learning mechanisms in a way similar to that observed in human cognition. This paper describes two novel methods for adjusting the exploration and exploitation ratio of agents, using a simple grid-world example and a two-armed bandit example.
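
For readers unfamiliar with the exploration and exploitation ratio, the following minimal sketch shows how a single parameter can control it on a two-armed bandit. It uses the standard epsilon-greedy rule as a generic baseline and is not one of the two methods the paper proposes, which the abstract does not specify. The function name run_bandit, the payoff probabilities, the step count, and the seed are all assumptions chosen for illustration.

```python
import random


def run_bandit(epsilon, arm_probs=(0.3, 0.7), steps=5000, seed=0):
    """Epsilon-greedy agent on a two-armed Bernoulli bandit.

    epsilon is the probability of exploring (picking an arm at random)
    on each step; otherwise the agent exploits its current best estimate.
    All default values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    counts = [0, 0]        # number of pulls per arm
    values = [0.0, 0.0]    # running mean reward estimate per arm
    explored = 0
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(2)                    # explore
            explored += 1
        else:
            arm = 0 if values[0] >= values[1] else 1  # exploit
        reward = 1 if rng.random() < arm_probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the sample mean for the chosen arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return {
        "explore_fraction": explored / steps,
        "mean_reward": total_reward / steps,
        "counts": counts,
    }


if __name__ == "__main__":
    for eps in (0.0, 0.1, 0.5, 1.0):
        print(eps, run_bandit(eps))
```

With epsilon set to 0 the agent never explores and may lock onto whichever arm it tried first; with epsilon set to 1 it never exploits what it has learned. Intermediate values trade one off against the other, which is the ratio the abstract refers to.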