Robust high performance reinforcement learning through weighted k-nearest neighbors

  • Authors:
  • José Antonio Martín H.; Javier de Lope; Darío Maravall

  • Affiliations:
  • Depto. de Arquitectura de Computadores y Automática, Universidad Complutense de Madrid, Spain; Dept. Applied Intelligent Systems, Universidad Politécnica de Madrid, Spain; Dept. Artificial Intelligence, Universidad Politécnica de Madrid, Spain

  • Venue:
  • Neurocomputing
  • Year:
  • 2011

Abstract

The aim of this paper is to present jointly a series of robust, high-performance (award-winning) implementations of reinforcement learning algorithms based on temporal-difference learning and weighted k-nearest neighbors for linear function approximation. These algorithms, named kNN-TD(λ) methods, were rigorously tested at the Second and Third Annual Reinforcement Learning Competitions (RLC2008 and RLC2009), held in Helsinki and Montreal respectively, where the kNN-TD(λ) method (JAMH team) won the PolyAthlon 2008 domain, took second place in 2009, and also took second place in the Mountain-Car 2008 domain, showing that it is one of the state-of-the-art general-purpose reinforcement learning implementations. These algorithms learn quickly, generalize properly over continuous state spaces, and remain robust to a high degree of environmental noise. Furthermore, we describe a derivation of the kNN-TD(λ) algorithm for problems where the use of continuous actions has clear advantages over the use of fine-grained discrete actions: the Ex reinforcement learning algorithm.
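Since the abstract describes kNN-TD(λ) only at a high level, the following is a minimal sketch of the underlying idea: weighted k-nearest-neighbor activations over a set of prototype states act as sparse features for a linear TD(λ) value estimator. The inverse-squared-distance kernel, accumulating eligibility traces, and all hyperparameters below are illustrative assumptions, not the authors' exact design.

```python
import numpy as np

class KNNTDLambda:
    """Sketch of TD(lambda) value estimation over a weighted
    k-nearest-neighbor feature map, in the spirit of kNN-TD(lambda).
    Kernel, trace type, and hyperparameters are assumptions."""

    def __init__(self, centers, k=4, alpha=0.1, gamma=0.99, lam=0.9):
        self.centers = np.asarray(centers)         # prototype states, shape (m, d)
        self.k = k
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.theta = np.zeros(len(self.centers))   # linear value weights
        self.e = np.zeros(len(self.centers))       # eligibility traces

    def features(self, s):
        """Sparse feature vector: inverse-squared-distance weights over
        the k nearest prototypes, normalized to sum to 1."""
        d = np.linalg.norm(self.centers - np.asarray(s), axis=1)
        nn = np.argsort(d)[:self.k]
        w = 1.0 / (1e-6 + d[nn] ** 2)              # assumed kernel choice
        phi = np.zeros(len(self.centers))
        phi[nn] = w / w.sum()
        return phi

    def value(self, s):
        return self.features(s) @ self.theta

    def update(self, s, r, s_next, done):
        """One TD(lambda) step with accumulating traces."""
        phi = self.features(s)
        target = r if done else r + self.gamma * self.value(s_next)
        delta = target - phi @ self.theta          # TD error
        self.e = self.gamma * self.lam * self.e + phi
        self.theta += self.alpha * delta * self.e
        if done:
            self.e[:] = 0.0                        # reset traces per episode
```

For a task like Mountain-Car, `centers` could be a uniform grid over the position-velocity space; the competition implementations presumably tuned k, the kernel, and the prototype layout per domain, and extended the same feature map to action values.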