Tentative Exploration on Reinforcement Learning Algorithms for Stochastic Rewards

Authors:
Luis Peña;Antonio Latorre;José-María Peña;Sascha Ossowski
Affiliations:
Artificial Intelligence Department, Universidad Rey Juan Carlos;Computer Architecture Department, Universidad Politécnica de Madrid;Computer Architecture Department, Universidad Politécnica de Madrid;Artificial Intelligence Department, Universidad Rey Juan Carlos
Venue:
HAIS '09 Proceedings of the 4th International Conference on Hybrid Artificial Intelligence Systems
Year:
2009

Abstract

This paper addresses the generation of mixed strategies with reinforcement learning algorithms in domains with stochastic rewards. It introduces TERSQ, a new algorithm based on the Q-learning model. Unlike other approaches for stochastic scenarios, TERSQ uses a single global exploration rate for all state-action pairs within the same run. This exploration rate is selected at the beginning of each round from a probability distribution, which is updated once the run is finished. We compare TERSQ with similar approaches that use probability distributions depending on state-action pairs. Two experimental scenarios are considered. The first deals with learning the optimal way to combine several evolutionary algorithms used simultaneously in a hybrid approach. In the second, the objective is to learn the best strategy for a set of competing agents in a combat-based video game.
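
The abstract gives no update rules, so the following Python sketch only illustrates the general idea of one exploration rate per run, drawn from a distribution that is adjusted after the run ends. It is not the authors' TERSQ algorithm. The candidate rates, the toy one-state task, the epsilon-greedy action selection, and the linear reward-style distribution update (`update_distribution`) are all assumptions made for illustration.

```python
import random


def update_distribution(probs, chosen, score, step=0.1):
    """Move probability mass toward the chosen rate, scaled by a score in [0, 1].

    Assumption: a linear reward-style update; the abstract does not say
    how the distribution is actually updated.
    """
    return [
        p + step * score * ((1.0 if i == chosen else 0.0) - p)
        for i, p in enumerate(probs)
    ]


def run_episode(q, epsilon, rng, steps=50, alpha=0.1, gamma=0.9):
    """One run of tabular Q-learning on a hypothetical one-state, two-action task."""
    state, total = 0, 0.0
    for _ in range(steps):
        # The same epsilon is used for every state-action pair in this run.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max(range(2), key=lambda a: q[state][a])
        # Stochastic reward: action 1 pays off more often than action 0.
        reward = 1.0 if rng.random() < (0.8 if action == 1 else 0.3) else 0.0
        best_next = max(q[state])  # the task has a single state
        q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
        total += reward
    return total / steps  # mean reward per step, in [0, 1]


def train(runs=200, seed=0):
    rng = random.Random(seed)
    rates = [0.05, 0.2, 0.5]                 # hypothetical candidate exploration rates
    probs = [1.0 / len(rates)] * len(rates)  # start from a uniform distribution
    q = [[0.0, 0.0]]                         # Q-table: one state, two actions
    for _ in range(runs):
        # Select the global exploration rate at the beginning of the run.
        idx = rng.choices(range(len(rates)), weights=probs, k=1)[0]
        score = run_episode(q, rates[idx], rng)
        # Update the distribution only once the run is finished.
        probs = update_distribution(probs, idx, score)
    return q, probs
```

Calling `train()` returns the learned Q-table and the final distribution over the candidate rates. A per-state-action scheme, as in the approaches the abstract compares against, would instead keep a separate distribution for each entry of the Q-table.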