Cooperative Multi-Agent Learning: The State of the Art
Autonomous Agents and Multi-Agent Systems
An overview of cooperative and competitive multiagent learning
LAMAS'05: Proceedings of the First International Conference on Learning and Adaption in Multi-Agent Systems
We use simulated soccer to study multiagent learning. Each team's players (agents) share an action set and a policy, but may behave differently because of position-dependent inputs. All agents on a team are rewarded or punished collectively when goals are scored. We conduct simulations with varying team sizes and compare two learning algorithms: TD-Q learning with linear neural networks (TD-Q) and Probabilistic Incremental Program Evolution (PIPE). TD-Q is based on evaluation functions (EFs) mapping input/action pairs to expected reward, while PIPE searches policy space directly. PIPE uses adaptive "probabilistic prototype trees" to synthesize programs that compute action probabilities from current inputs. Our results show that TD-Q encounters several difficulties in learning appropriate shared EFs. PIPE, however, does not depend on EFs and finds good policies faster and more reliably. This suggests that, in multiagent learning scenarios, direct search through policy space can offer advantages over EF-based approaches.
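To make the EF-based side of this comparison concrete, here is a minimal sketch of TD-Q learning with a linear evaluation function mapping input/action pairs to expected reward. It illustrates the general technique only, not the paper's actual implementation; the class name `LinearTDQ`, the hyperparameter defaults, and the usage snippet are assumptions introduced for this example.

```python
import numpy as np

class LinearTDQ:
    """Q-learning with a linear evaluation function Q(s, a) = w[a] . x(s).

    Illustrative sketch only; names and defaults are hypothetical,
    not taken from the paper.
    """

    def __init__(self, n_features, n_actions, alpha=0.01, gamma=0.99, epsilon=0.1):
        self.w = np.zeros((n_actions, n_features))  # one weight vector per action
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def q_values(self, x):
        # Evaluation function: expected reward estimate for each action in state x.
        return self.w @ x

    def act(self, x, rng):
        # Epsilon-greedy action selection over the evaluation function.
        if rng.random() < self.epsilon:
            return int(rng.integers(len(self.w)))
        return int(np.argmax(self.q_values(x)))

    def update(self, x, a, r, x_next, done):
        # TD(0) target: r + gamma * max_a' Q(s', a'), truncated at episode end.
        target = r if done else r + self.gamma * np.max(self.q_values(x_next))
        td_error = target - self.w[a] @ x
        self.w[a] += self.alpha * td_error * x  # gradient step on the linear EF


# Hypothetical usage: all players on a team share one learner (shared policy),
# each feeding in its own position-dependent input vector.
rng = np.random.default_rng(0)
team_ef = LinearTDQ(n_features=16, n_actions=4)
x = rng.normal(size=16)                                 # one player's inputs
a = team_ef.act(x, rng)
x_next = rng.normal(size=16)
team_ef.update(x, a, r=0.0, x_next=x_next, done=False)  # collective reward signal
```

Note that the shared weights are trained only on the collective goal reward, which hints at why learning a good shared EF is difficult in this setting; PIPE sidesteps the issue by searching the space of action-selecting programs directly.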