We consider a problem domain where coalitions of agents are formed in order to execute tasks. Each task is assigned at most one coalition of agents, and the coalition can be reorganized during execution. Executing a task means bringing it to one of the desired terminal states, which might take several time steps. The state of the task evolves even if no coalition is assigned to it, and its evolution depends nondeterministically on the cumulative actions of the agents in the coalition. Furthermore, we assume that the reward obtained for executing a task decays over time: the longer the execution of the task is delayed, the smaller the reward. A representative example of this class of problems is the allocation of firefighters to fires in a disaster rescue environment. We describe a practical methodology through which a problem of this class can be encoded as a Markov Decision Process. Because the resulting MDP is factored on three levels (its states, actions, and rewards are composites of the original features of the problem), it can be solved directly only for small problem instances. We describe two methods for parallel decomposition of the MDP: MDP RSUA, which performs random sampling and uniform allocation, and MDP REUSE, which reuses the lower-level MDP to allocate resources to the parallel subproblems. Through an experimental study which models the problem domain using the fire simulation components of the RoboCup Rescue simulator, we show that both methods significantly outperform heuristic approaches and that MDP REUSE provides overall higher performance than MDP RSUA.
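To make the encoding concrete, the sketch below models a single task as a finite-horizon MDP and solves it by backward-induction value iteration. All concrete names and numbers here (the task states, the action set of "number of firefighters assigned", the transition probabilities, the decay factor) are illustrative assumptions for the fire-fighting example, not the paper's actual model; the key features from the abstract are preserved: the task evolves even when no agents are assigned, action effects are nondeterministic, and the reward for reaching the desired terminal state decays with time.

```python
# Sketch: one task encoded as a finite-horizon MDP with a time-decaying
# reward, solved by backward induction. States, actions, probabilities,
# and the decay factor are assumed for illustration.

STATES = ["small_fire", "large_fire", "extinguished", "burned_out"]
TERMINAL = {"extinguished", "burned_out"}   # "extinguished" is the desired one
ACTIONS = [0, 1, 2]                         # firefighters assigned to the task

# P[s][a] -> list of (next_state, probability). The fire evolves even
# with no agents assigned (a = 0), and action effects are nondeterministic.
P = {
    "small_fire": {
        0: [("small_fire", 0.5), ("large_fire", 0.5)],
        1: [("extinguished", 0.7), ("small_fire", 0.3)],
        2: [("extinguished", 0.9), ("small_fire", 0.1)],
    },
    "large_fire": {
        0: [("large_fire", 0.6), ("burned_out", 0.4)],
        1: [("large_fire", 0.7), ("small_fire", 0.3)],
        2: [("small_fire", 0.6), ("large_fire", 0.4)],
    },
}

HORIZON = 10
DECAY = 0.8   # the later the task is completed, the smaller the reward

def reward(state, t):
    # Reward only on reaching the desired terminal state, decayed by time.
    return 100.0 * (DECAY ** t) if state == "extinguished" else 0.0

def solve():
    # V[t][s]: expected value of being in state s at time t.
    # policy[t][s]: best action for a non-terminal state s at time t.
    V = [{s: 0.0 for s in STATES} for _ in range(HORIZON + 1)]
    policy = [{} for _ in range(HORIZON)]
    for t in range(HORIZON - 1, -1, -1):
        for s in STATES:
            if s in TERMINAL:
                continue
            best_a, best_q = None, float("-inf")
            for a in ACTIONS:
                q = sum(p * (reward(s2, t + 1) + V[t + 1][s2])
                        for s2, p in P[s][a])
                if q > best_q:
                    best_a, best_q = a, q
            V[t][s] = best_q
            policy[t][s] = best_a
    return V, policy

V, policy = solve()
print(policy[0])   # best initial allocation per fire state
```

In the full problem the state, action, and reward are composites over all tasks and agents, so this table-based solve blows up combinatorially; the parallel decomposition methods from the abstract address exactly that by splitting the composite MDP into per-task subproblems and allocating agents across them.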