This paper investigates and compares single-agent reinforcement learning (RL) algorithms on the simple taxi problem domain and an extended version of it, and multiagent RL algorithms on a multiagent extension of the simple taxi domain that we created. In particular, we extend the Policy Hill Climbing (PHC) and the Win or Learn Fast-PHC (WoLF-PHC) algorithms by combining them with the MAXQ hierarchical decomposition and investigate their efficiency. The results for the multiagent domain are very promising: they indicate that the two newly created algorithms are the most efficient of those compared.
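To make the PHC/WoLF-PHC part of the abstract concrete, the following is a minimal, flat (non-hierarchical) sketch of the WoLF-PHC update as described by Bowling and Veloso: standard Q-learning plus a hill-climbing step on a stochastic policy, where the policy learning rate is small when the agent is "winning" (current policy outperforms its average policy) and large when "losing". All class and parameter names here are illustrative, not taken from the paper, and the MAXQ combination is omitted.

```python
import random
from collections import defaultdict


class WoLFPHC:
    """Sketch of WoLF-PHC: Q-learning with a variable-rate
    policy hill-climbing step (win -> small step, lose -> large step)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04):
        self.actions = list(actions)
        self.alpha, self.gamma = alpha, gamma
        self.delta_win, self.delta_lose = delta_win, delta_lose
        n = len(self.actions)
        self.q = defaultdict(float)                        # Q(s, a)
        self.pi = defaultdict(lambda: 1.0 / n)             # current policy pi(s, a)
        self.pi_avg = defaultdict(lambda: 1.0 / n)         # average policy
        self.counts = defaultdict(int)                     # visit counts C(s)

    def update(self, s, a, r, s_next):
        # 1. Standard Q-learning backup.
        best_next = max(self.q[(s_next, b)] for b in self.actions)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])

        # 2. Incrementally update the average-policy estimate for s.
        self.counts[s] += 1
        for b in self.actions:
            self.pi_avg[(s, b)] += (self.pi[(s, b)] - self.pi_avg[(s, b)]) / self.counts[s]

        # 3. Choose the step size: "win" if the current policy's expected
        #    value beats the average policy's, otherwise "lose".
        v_pi = sum(self.pi[(s, b)] * self.q[(s, b)] for b in self.actions)
        v_avg = sum(self.pi_avg[(s, b)] * self.q[(s, b)] for b in self.actions)
        delta = self.delta_win if v_pi > v_avg else self.delta_lose

        # 4. Hill-climb: shift probability mass toward the greedy action,
        #    then clip and renormalize to keep pi(s, .) a distribution.
        greedy = max(self.actions, key=lambda b: self.q[(s, b)])
        for b in self.actions:
            step = delta if b == greedy else -delta / (len(self.actions) - 1)
            self.pi[(s, b)] = min(1.0, max(0.0, self.pi[(s, b)] + step))
        total = sum(self.pi[(s, b)] for b in self.actions)
        for b in self.actions:
            self.pi[(s, b)] /= total

    def act(self, s):
        # Sample an action from the current stochastic policy.
        roll, acc = random.random(), 0.0
        for b in self.actions:
            acc += self.pi[(s, b)]
            if roll <= acc:
                return b
        return self.actions[-1]
```

Setting `delta_win == delta_lose` recovers plain PHC; the paper's contribution, per the abstract, is running these updates within a MAXQ task hierarchy rather than over the flat state space shown here.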