We introduce a model-based average-reward reinforcement learning method called H-learning and compare it with its discounted counterpart, Adaptive Real-Time Dynamic Programming, in a simulated robot scheduling task. We also introduce an extension to H-learning that automatically explores the unexplored parts of the state space while always choosing greedy actions with respect to the current value function. We show that this "Auto-exploratory H-learning" outperforms the original H-learning combined with previously studied exploration methods such as random, recency-based, or counter-based exploration.
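To make the idea of a model-based average-reward update concrete, the following is a minimal sketch of one learning step in the general style described above. It is not taken from the abstract and should not be read as the authors' exact algorithm: the class name `HLearnerSketch`, the helper `run_two_state_cycle`, the tabular count-based model, the step-size schedule, and the treatment of untried actions as having value 0 are all assumptions made for illustration. The sketch estimates transition probabilities and rewards from counts, keeps a relative value h(i) per state and a running estimate rho of the average reward per step, and adjusts rho only when the executed action is greedy under the current model.

```python
class HLearnerSketch:
    """Illustrative tabular, model-based average-reward learner (hypothetical names)."""

    def __init__(self, n_states, n_actions, rho0=0.0, alpha0=1.0):
        self.n_states = n_states
        self.n_actions = n_actions
        self.h = [0.0] * n_states      # relative value h(i) of each state
        self.rho = rho0                # estimate of average reward per step
        self.alpha = alpha0            # step size for rho (assumed schedule)
        # Visit counts used to estimate the transition model.
        self.n_sa = [[0] * n_actions for _ in range(n_states)]
        self.n_sas = [[[0] * n_states for _ in range(n_actions)]
                      for _ in range(n_states)]
        # Running mean of the immediate reward for each (state, action).
        self.r_hat = [[0.0] * n_actions for _ in range(n_states)]

    def backup(self, i, u):
        """Estimated reward plus expected h of the successor for action u in state i."""
        n = self.n_sa[i][u]
        if n == 0:
            return 0.0  # assumption: untried actions are valued at 0
        expected_h = sum(c / n * self.h[j]
                         for j, c in enumerate(self.n_sas[i][u]))
        return self.r_hat[i][u] + expected_h

    def greedy_actions(self, i):
        """All actions that maximize the one-step backup in state i."""
        values = [self.backup(i, u) for u in range(self.n_actions)]
        best = max(values)
        return [u for u, v in enumerate(values) if v == best]

    def update(self, i, u, reward, j):
        """Process one observed transition (i, u) -> j with the given reward."""
        # Update the learned model from the new sample.
        self.n_sa[i][u] += 1
        self.n_sas[i][u][j] += 1
        self.r_hat[i][u] += (reward - self.r_hat[i][u]) / self.n_sa[i][u]
        # Adjust the average-reward estimate only after greedy actions.
        if u in self.greedy_actions(i):
            target = self.r_hat[i][u] - self.h[i] + self.h[j]
            self.rho = (1 - self.alpha) * self.rho + self.alpha * target
            self.alpha = self.alpha / (self.alpha + 1)
        # Average-reward Bellman backup for the visited state.
        self.h[i] = max(self.backup(i, a)
                        for a in range(self.n_actions)) - self.rho


def run_two_state_cycle(steps=20):
    """Toy deterministic cycle 0 -> 1 -> 0 with rewards 0 and 2 (made-up example)."""
    learner = HLearnerSketch(n_states=2, n_actions=1)
    state = 0
    for _ in range(steps):
        next_state = 1 - state
        reward = 2.0 if state == 1 else 0.0
        learner.update(state, 0, reward, next_state)
        state = next_state
    return learner
```

In the toy cycle the rewards alternate between 0 and 2, so the long-run average reward is 1, and `run_two_state_cycle().rho` settles at approximately that value. Any exploration strategy, including the auto-exploratory variant mentioned in the abstract, would sit on top of a step like this and is deliberately left out of the sketch.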