The analysis and design of concurrent learning algorithms for cooperative multiagent systems

Authors:
Sean Luke; Liviu Panait
Affiliations:
George Mason University; George Mason University
Venue:
The analysis and design of concurrent learning algorithms for cooperative multiagent systems
Year:
2007

Citing 0
Cited 1

Cooperative coevolution and univariate estimation of distribution algorithms

Proceedings of the tenth ACM SIGEVO workshop on Foundations of genetic algorithms

Abstract

Concurrent learning is the application of several machine learning algorithms in parallel to automatically discover behaviors for teams of agents. Because machine learning techniques tend to find better solutions when given additional time and resources, the same would be expected of concurrent learning algorithms. Surprisingly, previous empirical and theoretical analyses have shown this not to be the case. Instead, concurrent learning algorithms often drift towards suboptimal solutions because the learners coadapt to one another. This phenomenon is a significant obstacle to using these techniques to discover optimal team behaviors. This thesis presents theoretical and empirical research on what causes this drift and on how to minimize it. I present evidence that the drift often occurs because learners have poor estimates of the quality of their possible behaviors. Interestingly, improving a learner's quality estimate does not require more sophisticated sensing capabilities; rather, it can be achieved simply by having the learner ignore certain reward information that it received for performing actions. I provide formal proofs that concurrent learning algorithms will converge to the globally optimal solution, provided that each learner has sufficiently accurate estimates. This theoretical analysis provides the foundation for the design of novel concurrent learning algorithms that benefit from accurate quality estimates. First, the biased cooperative coevolutionary algorithm greatly improves its learners' estimates by using reward information received in the past. Second, the informative cooperative coevolutionary algorithm provides learners with simpler, functionally equivalent estimates at a reduced computational cost. Finally, I describe the lenient multiagent Q-learning algorithm, which benefits from more accurate estimates when tackling challenging coordination tasks in stochastic domains.
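
The claim that a learner can improve its quality estimates by ignoring some of the rewards it receives can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the 3x3 payoff matrix is hypothetical, the teammate is assumed to try each of its actions equally often, and the "lenient" estimate is simplified to keeping only the highest reward observed for each action.

```python
# Hypothetical joint payoff matrix (an illustrative assumption, not data
# from the thesis). Rows are this learner's actions, columns are the
# teammate's actions, and both agents receive the same reward.
PAYOFF = [
    [11, -30, 0],
    [-30, 7, 6],
    [0, 0, 5],
]


def average_estimate(rewards):
    """Estimate an action's quality as the mean of all rewards seen for it."""
    return sum(rewards) / len(rewards)


def lenient_estimate(rewards):
    """Estimate an action's quality from the best reward only,
    ignoring the lower rewards caused by poorly matched teammate actions."""
    return max(rewards)


def preferred_action(estimator):
    """Return the index of the action with the highest estimated quality,
    assuming each teammate action was observed exactly once."""
    estimates = [estimator(row) for row in PAYOFF]
    return max(range(len(estimates)), key=estimates.__getitem__)


# Row whose best joint outcome is the global optimum of the matrix.
optimal_row = max(range(len(PAYOFF)), key=lambda i: max(PAYOFF[i]))

avg_choice = preferred_action(average_estimate)
lenient_choice = preferred_action(lenient_estimate)

print("average-based choice:", avg_choice)
print("lenient choice:", lenient_choice)
print("row containing the global optimum:", optimal_row)
```

In this toy setting the averaging learner is pulled towards the safe third action because the penalties in the first two rows dominate its estimates, while the learner that discards low rewards prefers the row containing the best joint outcome. The sketch shows only the estimation step; it does not model the learning dynamics or the stochastic rewards that the thesis analyzes.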