We present a principled and efficient planning algorithm for collaborative multiagent dynamical systems. All computation, during both the planning and the execution phases, is distributed among the agents; each agent only needs to model and plan for a small part of the system. Each of these local subsystems is small, but once they are combined they can represent an exponentially larger problem. The subsystems are connected through a subsystem hierarchy. Coordination and communication between the agents are not imposed, but derived directly from the structure of this hierarchy. A globally consistent plan is achieved by a message-passing algorithm, where messages correspond to natural local reward functions and are computed by local linear programs; another message-passing algorithm allows us to execute the resulting policy. When two portions of the hierarchy share the same structure, our algorithm can reuse plans and messages to speed up computation.
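The message-passing idea can be illustrated with a minimal sketch over a toy subsystem tree. Here plain enumeration over a small variable domain stands in for the local linear programs of the paper, and each subsystem's reward couples only its own variable with its parent's; the names `Subsystem`, `upward`, and `decode` are illustrative, not the paper's API:

```python
class Subsystem:
    """One node of a subsystem hierarchy. Its local reward depends on its own
    variable and (for non-root nodes) on the parent's variable."""

    def __init__(self, name, values, reward, children=()):
        self.name = name
        self.values = values      # domain of this subsystem's local variable
        self.reward = reward      # reward(parent_value, own_value) -> float
        self.children = list(children)

    def upward(self, parent_value):
        # Upward message: for a fixed parent value, the best achievable reward
        # in this subtree and the maximizing local value. (Enumeration stands
        # in for the local LP of the paper.)
        best_v, best_r = None, float("-inf")
        for v in self.values:
            r = self.reward(parent_value, v)
            r += sum(c.upward(v)[1] for c in self.children)
            if r > best_r:
                best_v, best_r = v, r
        return best_v, best_r

    def decode(self, parent_value, assignment):
        # Downward pass: fix this subsystem's maximizing value given the
        # parent's choice, then let the children decode against it.
        v, _ = self.upward(parent_value)
        assignment[self.name] = v
        for c in self.children:
            c.decode(v, assignment)
        return assignment


# Toy hierarchy: subsystem B earns reward 2 for matching A's variable;
# subsystem A earns reward 1 for choosing value 1.
leaf = Subsystem("B", [0, 1], lambda p, v: 2.0 if v == p else 0.0)
root = Subsystem("A", [0, 1], lambda p, v: 1.0 if v == 1 else 0.0,
                 children=[leaf])
plan = root.decode(None, {})
print(plan)  # {'A': 1, 'B': 1}: A sacrifices nothing, B coordinates with A
```

Only the message between a subsystem and its parent crosses the hierarchy boundary, which is what lets each agent plan against a small local model while the joint plan stays globally consistent.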