Reinforcement distribution in continuous state-action space fuzzy Q-learning: a novel approach

  • Authors:
  • Andrea Bonarini, Francesco Montrone, Marcello Restelli

  • Affiliations:
  • Politecnico di Milano, Electronic and Information Department, Milan, Italy (all authors)

  • Venue:
  • WILF'05: Proceedings of the 6th International Conference on Fuzzy Logic and Applications
  • Year:
  • 2005

Abstract

Fuzzy Q-learning extends the Q-learning algorithm to work in the presence of continuous state and action spaces. A Takagi-Sugeno Fuzzy Inference System (FIS) infers the continuous executed action and its action-value through the cooperation of several rules. The parameters of the FIS can evolve in different ways, depending on the strategy used to distribute the reinforcement signal. In this paper, we compare two strategies: the classical one, which rewards the rules whose proposed actions were composed to produce the executed action, and a new one we introduce, where the reward goes to the rules proposing actions closest to the one actually executed.
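
To make the contrast concrete, the Python sketch below composes a continuous action from the proposals of several firing rules and then updates each rule's action-value either in proportion to its firing strength (the classical strategy) or in proportion to how close its proposal is to the executed action (the novel strategy). Every concrete detail here is an assumption for illustration, not the authors' implementation: the rule structure, the greedy action choice, the Gaussian closeness kernel with width tau, and the simplified one-step update without bootstrapping.

    import numpy as np

    class FuzzyQRule:
        """One fuzzy rule: a small set of candidate actions, each with a q-value.
        (Illustrative structure, not the paper's.)"""
        def __init__(self, n_actions=5, rng=None):
            rng = rng or np.random.default_rng()
            self.actions = np.linspace(-1.0, 1.0, n_actions)  # candidate consequents
            self.q = rng.normal(scale=0.01, size=n_actions)   # their action-values

        def propose(self):
            # Greedy pick among the rule's candidates (exploration omitted).
            i = int(np.argmax(self.q))
            return i, self.actions[i]

    def compose_action(rules, strengths):
        """Takagi-Sugeno style composition: firing-strength-weighted
        average of the actions proposed by the active rules."""
        idx, acts = zip(*(r.propose() for r in rules))
        w = np.asarray(strengths, dtype=float)
        w /= w.sum()
        return float(np.dot(w, acts)), list(idx), w

    def update_classical(rules, idx, w, reward, alpha=0.1):
        """Classical strategy: every rule that contributed to the composed
        action is reinforced in proportion to its firing strength.
        (Simplified one-step update; no bootstrapped next-state value.)"""
        for rule, i, wi in zip(rules, idx, w):
            rule.q[i] += alpha * wi * (reward - rule.q[i])

    def update_closeness(rules, idx, executed, reward, alpha=0.1, tau=0.2):
        """Novel strategy (per the abstract): reinforcement goes preferentially
        to rules whose proposed action is closest to the executed one; the
        Gaussian kernel with width tau is an assumption of this sketch."""
        c = np.array([np.exp(-((r.actions[i] - executed) / tau) ** 2)
                      for r, i in zip(rules, idx)])
        c /= c.sum()
        for rule, i, ci in zip(rules, idx, c):
            rule.q[i] += alpha * ci * (reward - rule.q[i])

    # Usage: three rules fire with different strengths for the current state.
    rules = [FuzzyQRule() for _ in range(3)]
    a, idx, w = compose_action(rules, strengths=[0.7, 0.2, 0.1])
    update_classical(rules, idx, w, reward=1.0)    # strength-based credit
    update_closeness(rules, idx, a, reward=1.0)    # closeness-based credit

Note the design difference the sketch exposes: the classical update credits a rule for having contributed (high firing strength), even if its own proposal was far from the executed action, whereas the closeness-based update credits a rule for having proposed something near what was actually tried, regardless of how strongly it fired.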