An Online POMDP Algorithm Used by the PoliceForce Agents in the RoboCupRescue Simulation

Authors:
Sébastien Paquet;Ludovic Tobin;Brahim Chaib-draa
Affiliations:
DAMAS Laboratory, Laval University;DAMAS Laboratory, Laval University;DAMAS Laboratory, Laval University
Venue:
RoboCup 2005
Year:
2006


Abstract

In the RoboCupRescue simulation, the PoliceForce agents have to decide which roads to clear to help other agents navigate the city. In this article, we present how we have modelled their environment as a POMDP and, more importantly, we present our new online POMDP algorithm, which enables them to make good decisions in real time during the simulation. Our algorithm is based on a look-ahead search that finds the best action to execute at each cycle. We thus avoid the overwhelming complexity of computing a policy for every possible situation. To show the efficiency of our algorithm, we present results on standard POMDPs and in the RoboCupRescue simulation environment.
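
To make the idea of a look-ahead search over beliefs concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: it is a plain depth-limited expectimax search over belief states, with no pruning or heuristics, run on a hypothetical two-state toy problem. All names (`transition`, `observation`, `reward`, `belief_update`, `lookahead`) and all numeric values are assumptions made for illustration only.

```python
from typing import Dict, Optional, Tuple

Belief = Dict[str, float]

# Hypothetical toy model (tiger-style): the state is the door hiding a hazard.
STATES = ["left", "right"]
ACTIONS = ["listen", "open_left", "open_right"]
OBSERVATIONS = ["hear_left", "hear_right"]


def transition(s: str, a: str) -> Dict[str, float]:
    """P(s' | s, a): listening keeps the state, opening a door resets it."""
    if a == "listen":
        return {s: 1.0}
    return {"left": 0.5, "right": 0.5}


def observation(a: str, s_next: str) -> Dict[str, float]:
    """P(o | a, s'): listening is 85% accurate, opening is uninformative."""
    if a == "listen":
        correct = "hear_" + s_next
        return {o: (0.85 if o == correct else 0.15) for o in OBSERVATIONS}
    return {o: 0.5 for o in OBSERVATIONS}


def reward(s: str, a: str) -> float:
    """Immediate reward (assumed values): small cost to listen, large penalty for the hazard."""
    if a == "listen":
        return -1.0
    opened = a.split("_")[1]
    return -100.0 if opened == s else 10.0


def belief_update(b: Belief, a: str, o: str) -> Tuple[Belief, float]:
    """Bayes filter: return the posterior belief and P(o | b, a)."""
    unnorm = {}
    for s2 in STATES:
        predicted = sum(b[s] * transition(s, a).get(s2, 0.0) for s in STATES)
        unnorm[s2] = observation(a, s2)[o] * predicted
    p_o = sum(unnorm.values())
    if p_o == 0.0:
        return dict(b), 0.0
    return {s2: v / p_o for s2, v in unnorm.items()}, p_o


def lookahead(b: Belief, depth: int, gamma: float = 0.95) -> Tuple[float, Optional[str]]:
    """Depth-limited search from belief b; returns (value, best action)."""
    if depth == 0:
        return 0.0, None  # leaf estimate of 0 is an assumption of this sketch
    best_value, best_action = float("-inf"), None
    for a in ACTIONS:
        # Expected immediate reward under the current belief.
        q = sum(b[s] * reward(s, a) for s in STATES)
        # Expected discounted value over every reachable observation.
        for o in OBSERVATIONS:
            b_next, p_o = belief_update(b, a, o)
            if p_o > 0.0:
                q += gamma * p_o * lookahead(b_next, depth - 1, gamma)[0]
        if q > best_value:
            best_value, best_action = q, a
    return best_value, best_action
```

An agent using this sketch would call `lookahead(current_belief, depth)` once per cycle, execute the returned action, and then call `belief_update` with the observation it actually receives. Only beliefs reachable from the current one are ever evaluated, which is what lets an online search avoid computing a policy for every possible situation; the price is that the search depth bounds how far ahead the agent can plan.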