Solving partially observable Markov decision processes (POMDPs) is a complex task that is often intractable. This paper examines the problem of finding an optimal policy for POMDPs. While considerable effort has been devoted to developing algorithms that solve POMDPs, the question of automatically finding good low-dimensional spaces in multi-agent cooperative learning domains has not been explored thoroughly. To address this question, an online algorithm, CMEAS, is presented for the POMDP model. The algorithm performs a look-ahead search to find the best action to execute at each cycle, thereby avoiding the overwhelming complexity of computing a policy for every possible situation. A series of simulations demonstrates the soundness of this strategy and the performance of the proposed algorithm when multiple agents cooperate to find an optimal policy for POMDPs.
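The abstract does not give pseudocode for CMEAS, but the core idea it describes — a depth-limited look-ahead search over beliefs to pick the best action at each cycle, instead of computing a full policy offline — can be sketched as follows. This is a minimal single-agent illustration on the classic tiger problem; the specific model (two states, 85% listening accuracy, discount 0.95, reward values) is an assumption for the example and is not taken from the paper.

```python
import math

STATES = ["tiger-left", "tiger-right"]
ACTIONS = ["listen", "open-left", "open-right"]
OBSERVATIONS = ["hear-left", "hear-right"]
GAMMA = 0.95  # illustrative discount factor, not from the paper

def T(a, s, s2):
    """Transition Pr(s' | s, a): listening leaves the tiger in place;
    opening a door resets its position uniformly at random."""
    if a == "listen":
        return 1.0 if s == s2 else 0.0
    return 0.5

def O(a, s2, o):
    """Observation Pr(o | s', a): listening is 85% accurate;
    observations after opening a door carry no information."""
    if a == "listen":
        correct = (s2 == "tiger-left") == (o == "hear-left")
        return 0.85 if correct else 0.15
    return 0.5

def R(s, a):
    """Reward: small cost to listen, large penalty for the tiger's door."""
    if a == "listen":
        return -1.0
    if (a == "open-left") == (s == "tiger-left"):
        return -100.0  # opened the door hiding the tiger
    return 10.0

def belief_update(b, a, o):
    """Bayes update b'(s') ∝ O(o|s',a) · Σ_s T(s'|s,a) b(s).
    Returns the new belief and the normalizer Pr(o | b, a)."""
    unnorm = {s2: O(a, s2, o) * sum(T(a, s, s2) * b[s] for s in STATES)
              for s2 in STATES}
    z = sum(unnorm.values())
    return {s2: p / z for s2, p in unnorm.items()}, z

def lookahead(b, depth):
    """Depth-limited look-ahead from belief b.
    Returns (value, best action); leaf beliefs are valued at 0."""
    if depth == 0:
        return 0.0, None
    best_val, best_a = -math.inf, None
    for a in ACTIONS:
        val = sum(b[s] * R(s, a) for s in STATES)  # expected immediate reward
        for o in OBSERVATIONS:
            b2, p_o = belief_update(b, a, o)  # p_o = Pr(o | b, a)
            if p_o > 0.0:
                val += GAMMA * p_o * lookahead(b2, depth - 1)[0]
        if val > best_val:
            best_val, best_a = val, a
    return best_val, best_a
```

With a uniform belief and a search depth of 2, the agent prefers to listen (opening a door has expected reward -45), whereas with a belief of 0.95 that the tiger is on the left and depth 1 it opens the right door. The branching factor is |A|·|O| per level, which is why online methods like the one described keep the horizon short rather than solving for a policy over the whole belief space.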