This paper addresses the problem of generating mixed strategies with reinforcement learning algorithms in domains with stochastic rewards. A new algorithm based on the Q-learning model, called TERSQ, is introduced. Unlike other approaches for stochastic scenarios, TERSQ uses a single global exploration rate for all state/action pairs within the same run. This exploration rate is selected at the beginning of each run from a probability distribution, which is updated once the run finishes. We compare TERSQ with similar approaches that maintain a separate probability distribution for each state-action pair. Two experimental scenarios are considered. The first deals with learning the optimal way to combine several evolutionary algorithms used simultaneously by a hybrid approach; in the second, the objective is to learn the best strategy for a set of competing agents in a combat-based video game.
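The core mechanism described above — one global exploration rate per run, drawn from a distribution that is re-weighted by the run's outcome — can be sketched as follows. This is a minimal illustration, not the paper's exact method: the candidate rate grid, the multiplicative re-weighting rule, and the toy environment are all assumptions made for the example.

```python
import random
from collections import defaultdict

# Assumed grid of candidate exploration rates and an (unnormalized)
# categorical distribution over them; both are illustrative choices.
CANDIDATE_EPS = [0.05, 0.1, 0.2, 0.4]
eps_weights = [1.0] * len(CANDIDATE_EPS)

class ToyEnv:
    """Minimal two-action environment for demonstration (action 1 is better)."""
    n_actions = 2
    def reset(self):
        self.t = 0
        return 0
    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        return 0, reward, self.t >= 10  # next_state, reward, done

def sample_epsilon():
    """Pick the single exploration rate used for the whole run."""
    return random.choices(CANDIDATE_EPS, weights=eps_weights)[0]

def run_episode(env, Q, epsilon, alpha=0.1, gamma=0.95):
    """Standard epsilon-greedy Q-learning episode with a fixed global epsilon."""
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        if random.random() < epsilon:
            action = random.randrange(env.n_actions)
        else:
            action = max(range(env.n_actions), key=lambda a: Q[(state, a)])
        next_state, reward, done = env.step(action)
        best_next = max(Q[(next_state, a)] for a in range(env.n_actions))
        target = reward + (0.0 if done else gamma * best_next)
        Q[(state, action)] += alpha * (target - Q[(state, action)])
        state = next_state
        total_reward += reward
    return total_reward

def tersq_style_training(env, n_runs=200):
    Q = defaultdict(float)
    for _ in range(n_runs):
        eps = sample_epsilon()
        ret = run_episode(env, Q, eps)
        # Re-weight the distribution toward rates that produced higher
        # returns (a simple multiplicative update; the real rule is assumed).
        eps_weights[CANDIDATE_EPS.index(eps)] *= 1.0 + 0.01 * max(ret, 0.0)
    return Q
```

The key contrast with per-state exploration schemes is that `sample_epsilon` is called once per run, so every state/action pair in that run shares the same exploration rate, and only the run's aggregate return feeds back into the distribution.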