A challenge in applying reinforcement learning to large problems is managing the explosive growth in storage and time complexity. This is especially problematic in multi-agent systems, where the state space grows exponentially in the number of agents. Function approximation based on simple supervised learning is unlikely to scale to complex domains on its own, but structural abstraction that exploits system properties and problem representations shows more promise. In this paper, we investigate several classes of known abstractions: 1) symmetry, 2) decomposition into multiple agents, 3) hierarchical decomposition, and 4) sequential execution. We empirically compare memory requirements, learning time, and solution quality on two problem variations. Our results indicate that the most effective solutions come from combinations of structural abstractions, and encourage the development of methods for automatically discovering such abstractions in novel problem formulations.
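To make the first two claims concrete, here is a minimal sketch (not from the paper) of why the joint state space grows exponentially in the number of agents, and how a symmetry abstraction can shrink it. It assumes a toy domain with `n` interchangeable agents, each occupying one of `k` positions; treating permutations of identical agents as the same state is one instance of the symmetry abstraction the paper discusses.

```python
from itertools import product

def naive_states(n_agents: int, n_positions: int) -> set:
    # Every ordered tuple of agent positions is a distinct joint state,
    # so the table has n_positions ** n_agents entries.
    return set(product(range(n_positions), repeat=n_agents))

def symmetric_states(n_agents: int, n_positions: int) -> set:
    # Canonicalize by sorting agent positions: permuting interchangeable
    # agents maps to the same abstract state, leaving one representative
    # per multiset of positions (C(n_positions + n_agents - 1, n_agents)).
    return {tuple(sorted(s))
            for s in product(range(n_positions), repeat=n_agents)}

n, k = 4, 6
print(len(naive_states(n, k)))      # 6**4 = 1296 joint states
print(len(symmetric_states(n, k)))  # C(9, 4) = 126 abstract states
```

A tabular learner over the abstract states would need roughly a tenth of the memory here, and the gap widens rapidly as agents are added; the other abstraction classes (multi-agent, hierarchical, sequential) reduce the table along different axes.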