Learning via human feedback in continuous state and action spaces

  • Authors:
  • Ngo Anh Vien;Wolfgang Ertel;Tae Choong Chung

  • Affiliations:
  • Institute of Artificial Intelligence, Ravensburg-Weingarten University of Applied Sciences, Weingarten, Germany 88250;Institute of Artificial Intelligence, Ravensburg-Weingarten University of Applied Sciences, Weingarten, Germany 88250;Department of Computer Engineering, Kyung Hee University, Seoul, South Korea

  • Venue:
  • Applied Intelligence
  • Year:
  • 2013

Abstract

This paper considers the problem of extending Training an Agent Manually via Evaluative Reinforcement (TAMER) to continuous state and action spaces. The TAMER framework enables a non-technical human to train an agent through a natural form of human feedback, either negative or positive. Its advantages have been shown on tasks in which agents are trained by human feedback alone or by human feedback combined with environment rewards. However, these methods were originally designed for problems with discrete states and actions, or with continuous states and discrete actions. This paper proposes an extension of TAMER, called ACTAMER, that allows both continuous states and actions. The new framework can use any general function approximator to model the human trainer's feedback signal. Moreover, the combination of ACTAMER and reinforcement learning is also investigated and evaluated in two settings: sequential and simultaneous. Our experimental results demonstrate that the proposed method successfully allows a human to train an agent in two continuous state-action domains: Mountain Car and Cart-pole (balancing).
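To make the idea of approximating a trainer's feedback signal over continuous states and actions more concrete, the following is a minimal illustrative sketch, not the ACTAMER algorithm itself. Everything in it is an assumption made for illustration: the one-dimensional state and action, the Gaussian radial-basis features, the class name `HumanRewardModel`, the helper `choose_action` (which simply scans a grid of candidate actions), and the simulated trainer that stands in for a human by returning +1 when the action has the same sign as the state and -1 otherwise.

```python
import math
import random


class HumanRewardModel:
    """Hypothetical linear model of human feedback over RBF features."""

    def __init__(self, centers, width=0.5, lr=0.1):
        self.centers = centers            # list of (state, action) centers
        self.width = width                # shared RBF width (assumed)
        self.lr = lr                      # learning rate (assumed)
        self.weights = [0.0] * len(centers)

    def features(self, state, action):
        # Gaussian RBF activation for each (state, action) center.
        x = (state, action)
        scale = 2.0 * self.width ** 2
        return [
            math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / scale)
            for c in self.centers
        ]

    def predict(self, state, action):
        phi = self.features(state, action)
        return sum(w * f for w, f in zip(self.weights, phi))

    def update(self, state, action, feedback):
        # One gradient step on the squared error between feedback and prediction.
        phi = self.features(state, action)
        error = feedback - sum(w * f for w, f in zip(self.weights, phi))
        self.weights = [w + self.lr * error * f
                        for w, f in zip(self.weights, phi)]


def choose_action(model, state, low=-1.0, high=1.0, n_candidates=21):
    # Greedy choice over evenly spaced candidate actions (a simplification).
    step = (high - low) / (n_candidates - 1)
    candidates = [low + i * step for i in range(n_candidates)]
    return max(candidates, key=lambda a: model.predict(state, a))


def simulated_trainer(state, action):
    # Stand-in for a human: approve actions that share the state's sign.
    return 1.0 if state * action > 0 else -1.0


random.seed(0)
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
model = HumanRewardModel([(s, a) for s in grid for a in grid])

for _ in range(2000):
    s = random.uniform(-1.0, 1.0)
    a = random.uniform(-1.0, 1.0)
    model.update(s, a, simulated_trainer(s, a))

print(choose_action(model, 0.5), choose_action(model, -0.5))
```

In this sketch the learned model plays the role of the human reward predictor, and the agent acts greedily with respect to it. Scanning a fixed grid of candidate actions only works for low-dimensional action spaces; a method intended for general continuous actions would need a different way to select actions.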