A Reinforcement Learning Scheme for a Partially-Observable Multi-Agent Game

Authors:
Shin Ishii; Hajime Fujita; Masaoki Mitsutake; Tatsuya Yamazaki; Jun Matsuda; Yoichiro Matsuno
Affiliations:
Nara Institute of Science and Technology, CREST, Japan Science and Technology Agency, Ikoma, Japan 630-0192; Nara Institute of Science and Technology, Ikoma, Japan 630-0192; Nara Institute of Science and Technology, Ikoma, Japan 630-0192; National Institute of Information and Communications Technology, Kyoto, Japan 619-0289; Osaka Gakuin University, Suita, Japan 564-8511; Ricoh Co. Ltd., Tokyo, Japan 112-0002
Venue:
Machine Learning
Year:
2005

Abstract

We formulate the automatic acquisition of a strategy for the multi-agent card game "Hearts" as a reinforcement learning problem. From the viewpoint of a single agent, the problem can be treated approximately within the framework of a partially observable Markov decision process (POMDP). Hearts is an imperfect-information game, and such games are more difficult to deal with than perfect-information games. A POMDP is a decision problem that includes a process for estimating unobservable state variables. By regarding missing information as unobservable state variables, an imperfect-information game can be formulated as a POMDP. However, Hearts is a realistic problem with a huge number of possible states, even when it is approximated as a single-agent system. Further approximation is therefore necessary to make the strategy acquisition problem tractable. This article presents an approximation method based on estimating the unobservable state variables and predicting the actions of the other agents. Simulation results show that our reinforcement learning method is applicable to such a difficult multi-agent problem.
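
To make the idea of "regarding missing information as unobservable state variables" concrete, the following is a minimal sketch of a belief over who holds each unseen card in Hearts. It is not the method of the article, whose details the abstract does not give. The function name `card_beliefs`, the card encoding, and the opponent labels are hypothetical, and the belief is a crude approximation: uniform over the opponents not known to be void in the card's suit, ignoring hand-size constraints.

```python
from itertools import product

SUITS = "CDHS"
RANKS = "23456789TJQKA"
# 52 cards encoded as rank + suit, e.g. "QS" (encoding is an assumption).
DECK = [r + s for s, r in product(SUITS, RANKS)]


def card_beliefs(own_hand, played, void):
    """Return {card: {opponent: probability}} for every unseen card.

    own_hand: cards held by the learning agent
    played:   cards already played by any player
    void:     {opponent: set of suits that opponent is known to lack},
              e.g. inferred when the opponent failed to follow suit
    """
    opponents = sorted(void)
    seen = set(own_hand) | set(played)
    beliefs = {}
    for card in DECK:
        if card in seen:
            continue  # observable, so no estimation is needed
        suit = card[1]
        # Only opponents not known to be void in this suit can hold the card.
        holders = [o for o in opponents if suit not in void[o]]
        beliefs[card] = {
            o: (1.0 / len(holders) if o in holders else 0.0)
            for o in opponents
        }
    return beliefs
```

For example, calling `card_beliefs(["2C"], ["3C"], {"east": {"S"}, "north": set(), "west": set()})` assigns the queen of spades zero probability for `east` and splits it evenly between `north` and `west`. A learning agent could use such a belief as an estimate of the hidden state in place of the true, unobservable one.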