Many real-world tasks require multiple decision makers (agents) to coordinate their actions in order to achieve common long-term goals. Examples include manufacturing systems, where the managers of a factory coordinate to maximize profit; rescue robots that, after an earthquake, must find victims safely and as quickly as possible; and sensor networks, where multiple sensors collaborate to perform a large-scale sensing task under strict power constraints. All of these tasks require solving complex long-term multiagent planning problems in uncertain, dynamic environments.

Factored Markov decision processes (MDPs) represent complex uncertain dynamic systems very compactly by exploiting problem-specific structure. Specifically, the state of the system is described by a set of variables that evolve stochastically over time, using a representation called a dynamic Bayesian network (DBN) that often allows for an exponential reduction in representation complexity. However, the complexity of exact solution algorithms for such MDPs grows exponentially in the number of variables and in the number of agents. This thesis builds a formal framework and approximate planning algorithms that exploit structure in factored MDPs to solve problems with many trillions of states and actions very efficiently.

The main contributions of this thesis include:

Factored linear programs: a novel LP decomposition technique, using ideas from inference in Bayesian networks, that exploits problem structure to reduce exponentially large LPs to provably equivalent, polynomially sized ones.

Factored approximate planning: a suite of algorithms, building on the factored LP decomposition technique, that exploit structure in factored MDPs to obtain exponential reductions in planning time.

Distributed coordination: an efficient distributed multiagent decision-making algorithm in which the coordination structure arises naturally from the factored representation of the system dynamics.

Generalization in relational MDPs: a framework for obtaining general solutions from a small set of environments, allowing agents to act in new environments without replanning.

Empirical evaluation: a detailed evaluation on a variety of large-scale tasks, including multiagent coordination in a real strategic computer game, demonstrating that the formal framework yields effective plans, complex agent coordination, and successful generalization in some of the largest planning problems in the literature.
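Both the factored LP decomposition and the distributed coordination contribution rest on the same core operation: maximizing a sum of local functions by eliminating one variable at a time, instead of enumerating the exponentially large joint assignment space. The following is a minimal sketch of that operation; the `eliminate_max` helper, the `(scope, table)` factor encoding, and the toy numbers are illustrative assumptions, not the thesis's actual implementation.

```python
import itertools

def eliminate_max(factors, order, domains):
    """Max over all variables of a sum of local factors, maximizing out
    one variable at a time instead of enumerating the joint space.
    Each factor is a (scope, table) pair: a tuple of variable names and
    a dict mapping assignments (tuples, in scope order) to values."""
    for var in order:
        touching = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        # Combined scope of the replacement factor, with `var` maximized out.
        scope = tuple(sorted({v for s, _ in touching for v in s} - {var}))
        table = {}
        for assign in itertools.product(*(domains[v] for v in scope)):
            ctx = dict(zip(scope, assign))
            table[assign] = max(
                sum(t[tuple(dict(ctx, **{var: x})[v] for v in s)]
                    for s, t in touching)
                for x in domains[var])
        factors.append((scope, table))
    # Only constant (empty-scope) factors remain after full elimination.
    return sum(t[()] for _, t in factors)

# Toy factored function over three binary variables: f1(a, b) + f2(b, c).
domains = {"a": [0, 1], "b": [0, 1], "c": [0, 1]}
f1 = (("a", "b"), {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 0})
f2 = (("b", "c"), {(0, 0): 0, (0, 1): 2, (1, 0): 1, (1, 1): 1})
best = eliminate_max([f1, f2], ["c", "b", "a"], domains)
print(best)  # -> 5  (attained at a=1, b=0, c=1)
```

Run over the agents' action variables, with the maximizing choices recorded, this same elimination scheme yields distributed joint-action selection; inside the LP, it is what replaces an exponential set of constraints with a polynomially sized, provably equivalent one. The cost is exponential only in the size of the intermediate scopes, not in the total number of variables.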