Reinforcement distribution in fuzzy Q-learning

  • Authors:
  • Andrea Bonarini, Alessandro Lazaric, Francesco Montrone, Marcello Restelli

  • Affiliations:
  • Department of Electronics and Information, Politecnico di Milano, Milan, Italy (all authors)

  • Venue:
  • Fuzzy Sets and Systems
  • Year:
  • 2009

Abstract

Q-learning is one of the most popular reinforcement learning methods; it allows an agent to learn the relationship between interval-valued state and action spaces through direct interaction with the environment. Fuzzy Q-learning extends this algorithm to evolve fuzzy inference systems (FIS) that operate on continuous state and action spaces. In a FIS, the interaction among fuzzy rules plays a primary role in achieving good performance and robustness. Learning a system in which this interaction is present creates problems for the learning mechanism, since possibly incoherent reinforcements may reach the same rule because of its interaction with other rules. In this paper, we introduce different strategies for distributing reinforcement that reduce this undesired effect and stabilize the obtained reinforcement. In particular, we present two strategies: the former focuses on rewarding the actions chosen by each rule during the cooperation phase; the latter rewards the rules whose proposed actions are closest to the one actually executed, rather than the rules that contributed to generating it.
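
To make the distribution problem concrete, here is a minimal, hypothetical Python sketch of a fuzzy Q-learner: each rule keeps Q-values over a discrete set of candidate actions, the executed action is the firing-strength-weighted blend of the rules' greedy proposals, and a `proximity` flag switches between crediting each rule for its own chosen action and weighting credit toward rules whose proposals were closest to the executed action. This is a paraphrase of the two strategies named above, not the authors' exact formulations; all names, parameters, and the update rule itself are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch, not the paper's implementation. Names
# (FuzzyQLearner, act, update, proximity) are illustrative.
class FuzzyQLearner:
    def __init__(self, n_rules, actions, alpha=0.1, gamma=0.95):
        self.q = np.zeros((n_rules, len(actions)))  # Q-value per (rule, action)
        self.actions = np.asarray(actions, dtype=float)
        self.alpha, self.gamma = alpha, gamma

    def act(self, phi):
        """phi: firing strength of each rule in the current state (sums to 1).
        Each rule proposes its greedy action; the executed action is the
        firing-strength-weighted blend, as in standard FIS defuzzification."""
        self.chosen = self.q.argmax(axis=1)        # action index chosen per rule
        return float(phi @ self.actions[self.chosen])

    def update(self, phi, r, phi_next, proximity=False):
        """Distribute the reinforcement r among the rules that fired.
        Call after act(). proximity=False: credit each rule for the action it
        chose (first strategy). proximity=True: weight credit toward rules
        whose proposal was closest to the executed action (second strategy)."""
        v_next = float(phi_next @ self.q.max(axis=1))        # fuzzy state value
        executed = float(phi @ self.actions[self.chosen])    # blended action
        for i, w in enumerate(phi):
            if w == 0.0:
                continue                         # rule did not fire: no credit
            a = self.chosen[i]
            if proximity:
                # Proposals closer to the executed action get a larger share.
                w = w / (1.0 + abs(self.actions[a] - executed))
            td = r + self.gamma * v_next - self.q[i, a]      # TD error
            self.q[i, a] += self.alpha * w * td

# Usage (illustrative): two rules covering a 1-D state, three candidate actions.
learner = FuzzyQLearner(n_rules=2, actions=[-1.0, 0.0, 1.0])
phi = np.array([0.7, 0.3])
a = learner.act(phi)
learner.update(phi, r=1.0, phi_next=np.array([0.2, 0.8]), proximity=True)
```

The sketch shows why incoherent reinforcements arise: both rules are updated with the same reward even though the executed action is a blend neither of them proposed, and the `proximity` weighting is one way to dampen that effect.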