Multi-agent Learning Dynamics: A Survey
CIA '07 Proceedings of the 11th international workshop on Cooperative Information Agents XI
Analyzing the dynamics of stigmergetic interactions through pheromone games
Theoretical Computer Science
Taking turns in general sum Markov games
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1
Networks of learning automata and limiting games
ALAMAS'05/ALAMAS'06/ALAMAS'07 Proceedings of the 5th, 6th and 7th European conference on Adaptive and learning agents and multi-agent systems: adaptation and multi-agent learning
A new class of ε-optimal learning automata
ICIC'11 Proceedings of the 7th international conference on Advanced Intelligent Computing
A feedforward network whose units are teams of parameterized learning automata is considered as a model of a reinforcement learning system. The internal state vector of each learning automaton is updated by an algorithm consisting of a gradient-following term and a random perturbation term. It is shown that the algorithm converges weakly to a solution of the Langevin equation, implying that the algorithm globally maximizes an appropriate function. The algorithm is decentralized, and the units exchange no information during updating. Simulation results on common payoff games and pattern recognition problems show that reasonable rates of convergence can be obtained.
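The update described in the abstract, a gradient-following term plus a random perturbation term, can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the softmax mapping from internal state to action probabilities, the REINFORCE-style gradient term, the clipping used to keep the state bounded, and the names update and run_team with their parameters step, sigma, and bound. The sketch shows two automata playing a common payoff game, each updating from the shared reward alone and exchanging no other information.

```python
import math
import random


def softmax(u):
    """Map an internal state vector to action probabilities (assumed form)."""
    m = max(u)
    exps = [math.exp(x - m) for x in u]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an action index according to probs."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


def update(u, action, reward, step, sigma, bound, rng):
    """One Langevin-style step: gradient-following term plus Gaussian noise.

    The clipping to [-bound, bound] is an illustrative stand-in for
    whatever mechanism keeps the state bounded; it is an assumption.
    """
    probs = softmax(u)
    new_u = []
    for j, (uj, pj) in enumerate(zip(u, probs)):
        # Derivative of ln p(action) with respect to u_j under softmax.
        grad = (1.0 if j == action else 0.0) - pj
        noise = rng.gauss(0.0, sigma)
        value = uj + step * reward * grad + math.sqrt(step) * noise
        new_u.append(max(-bound, min(bound, value)))
    return new_u


def run_team(payoff, steps=20000, step=0.05, sigma=0.1, bound=5.0, seed=0):
    """Two automata play a common payoff game with Bernoulli rewards.

    payoff[a][b] is the probability of reward 1 for the joint action (a, b).
    Each automaton sees only its own action and the shared reward.
    """
    rng = random.Random(seed)
    u_row = [0.0] * len(payoff)
    u_col = [0.0] * len(payoff[0])
    for _ in range(steps):
        a = sample(softmax(u_row), rng)
        b = sample(softmax(u_col), rng)
        reward = 1.0 if rng.random() < payoff[a][b] else 0.0
        u_row = update(u_row, a, reward, step, sigma, bound, rng)
        u_col = update(u_col, b, reward, step, sigma, bound, rng)
    return softmax(u_row), softmax(u_col)
```

Calling run_team([[0.9, 0.1], [0.1, 0.6]]) returns the final action probabilities of the two automata for a hypothetical 2x2 game with two pure equilibria of unequal payoff. Setting sigma to 0 removes the perturbation term, which is one way to compare the purely gradient-following behaviour with the perturbed version under these assumptions.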