Cooperative games are those in which both agents share the same payoff structure. Value-based reinforcement-learning algorithms, such as variants of Q-learning, have been applied to learning cooperative games, but they apply only when the game state is completely observable to both agents. Policy search methods are a reasonable alternative to value-based methods for partially observable environments. In this paper, we provide a gradient-based distributed policy-search method for cooperative games and compare the notion of local optimum to that of Nash equilibrium. We demonstrate the effectiveness of this method experimentally in a small, partially observable simulated soccer domain.
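To make the relation between local optima and Nash equilibria concrete, the following is a minimal sketch, not the method of the paper. It assumes a hypothetical 2x2 common-payoff matrix game (the `PAYOFF` values are invented for illustration), gives each agent a single policy parameter, and lets each agent ascend its own partial derivative of the shared expected payoff. The gradients are computed exactly so that the run is deterministic; a learning agent would instead have to estimate them from sampled rewards.

```python
import math

# Hypothetical shared payoff matrix (an assumption for illustration):
# both agents receive PAYOFF[a1][a2] when they play actions a1 and a2.
PAYOFF = [[10.0, 0.0],
          [0.0, 5.0]]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def expected_payoff(p, q):
    """Shared expected payoff when agent 1 plays action 0 with
    probability p and agent 2 plays action 0 with probability q."""
    return (p * q * PAYOFF[0][0]
            + p * (1 - q) * PAYOFF[0][1]
            + (1 - p) * q * PAYOFF[1][0]
            + (1 - p) * (1 - q) * PAYOFF[1][1])


def distributed_gradient_ascent(theta1, theta2, lr=0.1, steps=2000):
    """Each agent updates only its own parameter, following the partial
    derivative of the common expected payoff with respect to it."""
    for _ in range(steps):
        p, q = sigmoid(theta1), sigmoid(theta2)
        # Partial derivatives of the expected payoff w.r.t. p and q.
        dj_dp = (q * (PAYOFF[0][0] - PAYOFF[1][0])
                 + (1 - q) * (PAYOFF[0][1] - PAYOFF[1][1]))
        dj_dq = (p * (PAYOFF[0][0] - PAYOFF[0][1])
                 + (1 - p) * (PAYOFF[1][0] - PAYOFF[1][1]))
        # Chain rule through the sigmoid parameterization.
        theta1 += lr * dj_dp * p * (1 - p)
        theta2 += lr * dj_dq * q * (1 - q)
    return sigmoid(theta1), sigmoid(theta2)
```

In this toy game, starting from `theta1 = theta2 = 1.0` drives both agents toward action 0 and an expected payoff close to 10, while starting from `theta1 = theta2 = -2.0` drives both toward action 1 and a payoff close to 5. Both joint actions are Nash equilibria, since neither agent can gain by deviating alone, but only the first is globally optimal. The second start therefore shows a local optimum of the joint gradient ascent that is a Nash equilibrium without being the best joint policy.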