Solving very large weakly coupled Markov decision processes

Authors:
Nicolas Meuleau;Milos Hauskrecht;Kee-Eung Kim;Leonid Peshkin;Leslie Pack Kaelbling;Thomas Dean;Craig Boutilier
Venue:
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Year:
1998


Abstract

We present a technique for computing approximately optimal solutions to stochastic resource allocation problems modeled as Markov decision processes (MDPs). We exploit two key properties to avoid explicitly enumerating the very large state and action spaces associated with these problems. First, the problems are composed of multiple tasks whose utilities are independent. Second, the actions taken with respect to (or resources allocated to) a task do not influence the status of any other task. We can therefore view each task as an MDP. However, these MDPs are weakly coupled by resource constraints: actions selected for one MDP restrict the actions available to others. We describe heuristic techniques for dealing with several classes of constraints that use the solutions for individual MDPs to construct an approximate global solution. We demonstrate this technique on problems involving thousands of tasks, approximating the solution to problems that are far beyond the reach of standard methods.
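
The general idea of solving each task MDP separately and then combining the per-task solutions under a shared resource constraint can be illustrated with a minimal Python sketch. It is not the authors' algorithm. The two-state task model (pending, done), the success probability 1 - (1 - hit_prob)^a for a units, the per-step budget, and the greedy marginal-gain rule are all assumptions made for illustration, as are the names `solve_task` and `greedy_allocate`. Each task's Q-values are computed as if the task were unconstrained in the future, which is an optimistic simplification.

```python
from typing import List


def solve_task(reward: float, hit_prob: float, unit_cost: float,
               max_units: int, gamma: float = 0.95,
               sweeps: int = 200) -> List[float]:
    """Value iteration for one hypothetical two-state task MDP.

    States: 'pending' and 'done' (absorbing, value 0).
    Action a = number of resource units applied this step.
    Returns Q[a] for the 'pending' state.
    """
    v_pending = 0.0
    q = [0.0] * (max_units + 1)
    for _ in range(sweeps):
        for a in range(max_units + 1):
            # Assumed model: each unit succeeds independently.
            p_done = 1.0 - (1.0 - hit_prob) ** a
            q[a] = (p_done * reward - unit_cost * a
                    + gamma * (1.0 - p_done) * v_pending)
        v_pending = max(q)
    return q


def greedy_allocate(q_tables: List[List[float]], budget: int) -> List[int]:
    """Hand out up to `budget` units, one at a time, to the pending task
    whose individual Q-value gains the most from one extra unit."""
    alloc = [0] * len(q_tables)
    for _ in range(budget):
        best_gain, best_task = 0.0, None
        for i, q in enumerate(q_tables):
            if alloc[i] + 1 < len(q):
                gain = q[alloc[i] + 1] - q[alloc[i]]
                if gain > best_gain:
                    best_gain, best_task = gain, i
        if best_task is None:
            break  # no task benefits from another unit
        alloc[best_task] += 1
    return alloc


if __name__ == "__main__":
    # (reward, hit_prob, unit_cost) per task; illustrative values only.
    tasks = [(10.0, 0.5, 1.0), (1.0, 0.5, 1.0), (6.0, 0.3, 0.5)]
    q_tables = [solve_task(r, p, c, max_units=3) for r, p, c in tasks]
    print(greedy_allocate(q_tables, budget=3))
```

The work per decision grows with the number of tasks times the budget, rather than with the size of the joint state and action spaces, which is the property the abstract relies on. The greedy rule itself carries no optimality guarantee.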