Reinforcement learning for cooperative actions in a partially observable multi-agent system

  • Authors:
  • Yuki Taniguchi; Takeshi Mori; Shin Ishii

  • Affiliations:
  • Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma, Japan (all authors)

  • Venue:
  • ICANN'07: Proceedings of the 17th International Conference on Artificial Neural Networks
  • Year:
  • 2007

Abstract

In this article, we apply policy gradient-based reinforcement learning to enable multiple agents to perform cooperative actions in a partially observable environment. We introduce an auxiliary state variable, an internal state, whose stochastic process is Markovian, to extract important features of the multi-agent system's dynamics. Computer simulations show that every agent can identify an appropriate internal-state model and acquire a good policy; this approach proves more effective than a traditional memory-based method.
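The core idea of the abstract, combining a policy-gradient method with an auxiliary internal state so the agent can act under partial observability, can be sketched as a finite-state controller trained with REINFORCE. This is an illustrative toy only: the environment, the agent sizes, and the vanilla REINFORCE estimator below are assumptions for demonstration, not the authors' actual algorithm or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): 1 aliased observation, 2 internal
# states, 2 actions. The policy jointly chooses an action and the next
# internal state, so the internal state acts as learned memory.
N_INT, N_ACT = 2, 2
theta = np.zeros((N_INT, N_ACT * N_INT))  # logits per internal state

def policy(h):
    """Softmax over joint (action, next internal state), given memory h."""
    logits = theta[h]
    p = np.exp(logits - logits.max())
    return p / p.sum()

def step_env(state, action):
    """Toy aliased world (assumption): the hidden state alternates
    0,1,0,1,... and always emits the same observation, so reward can
    only be earned by tracking parity with the internal state."""
    reward = 1.0 if action == state else 0.0
    return (state + 1) % 2, reward

def run_episode(T=20):
    """Roll out one episode; return the trajectory and total reward."""
    state, h, traj, total = 0, 0, [], 0.0
    for _ in range(T):
        p = policy(h)
        joint = rng.choice(N_ACT * N_INT, p=p)
        action, h_next = joint % N_ACT, joint // N_ACT
        state, r = step_env(state, action)
        traj.append((h, joint))
        total += r
        h = h_next
    return traj, total

def train(lr=0.1, episodes=300):
    """Vanilla REINFORCE with a running-average baseline (a simpler
    stand-in for the paper's gradient estimator)."""
    baseline = 0.0
    for _ in range(episodes):
        traj, R = run_episode()
        baseline += 0.05 * (R - baseline)
        for h, joint in traj:
            grad = -policy(h)
            grad[joint] += 1.0  # gradient of log-softmax at the sample
            theta[h] += lr * (R - baseline) * grad

train()
_, final_reward = run_episode()
print(final_reward)
```

Because both hidden world states look identical to the agent, a memoryless policy cannot do better than chance here; the learned internal-state transitions are what let the policy alternate actions correctly, which is the role the abstract assigns to the auxiliary internal state.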