An Upper Bound on the Loss from Approximate Optimal-Value Functions
Machine Learning
Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning
Artificial Intelligence
Neuro-Dynamic Programming
Scaling Reinforcement Learning toward RoboCup Soccer
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Reinforcement Learning and Shaping: Encouraging Intended Behaviors
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
State abstraction for programmable reinforcement learning agents
Eighteenth national conference on Artificial intelligence
A Distributed Reinforcement Learning Scheme for Network Routing
Autonomous shaping: knowledge transfer in reinforcement learning
ICML '06 Proceedings of the 23rd international conference on Machine learning
QUICR-learning for multi-agent coordination
AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Hierarchical reinforcement learning with the MAXQ value function decomposition
Journal of Artificial Intelligence Research
Potential-based shaping and Q-value initialization are equivalent
Journal of Artificial Intelligence Research
Efficient solution algorithms for factored MDPs
Journal of Artificial Intelligence Research
Hierarchical solution of Markov decision processes using macro-actions
UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence
Multigrid Reinforcement Learning with Reward Shaping
ICANN '08 Proceedings of the 18th international conference on Artificial Neural Networks, Part I
Co-evolution of Shaping Rewards and Meta-Parameters in Reinforcement Learning
Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Multi-task evolutionary shaping without pre-specified representations
Proceedings of the 12th annual conference on Genetic and evolutionary computation
Dynamic reward shaping: training a robot by voice
IBERAMIA'10 Proceedings of the 12th Ibero-American conference on Advances in artificial intelligence
Theoretical considerations of potential-based reward shaping for multi-agent systems
The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Multi-agent reward shaping for RoboCup KeepAway
The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
Dynamic potential-based reward shaping
Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Reinforcement Learning with Reward Shaping and Mixed Resolution Function Approximation
International Journal of Agent Technologies and Systems
Learning potential functions and their representations for multi-task reinforcement learning
Autonomous Agents and Multi-Agent Systems
This paper investigates the problem of automatically learning how to restructure the reward function of a Markov decision process so as to speed up reinforcement learning. We begin by describing a method that learns a shaped reward function given a set of state and temporal abstractions. Next, we consider decomposition of the per-timestep reward in multieffector problems, in which the overall agent can be decomposed into multiple units that are concurrently carrying out various tasks. We show by example that to find a good reward decomposition, it is often necessary to first shape the rewards appropriately. We then give a function approximation algorithm for solving both problems together. Standard reinforcement learning algorithms can be augmented with our methods, and we show experimentally that in each case, significantly faster learning results.
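The shaping described in the abstract builds on potential-based reward shaping, in which a bonus F(s, s') = γΦ(s') − Φ(s) is added to each reward without changing the optimal policy (Ng et al., ICML '99). The sketch below shows this idea in tabular Q-learning; it is a minimal illustration, not the paper's algorithm, and the `env` interface (`reset`, `step`, `actions`) and `potential` function are assumptions for the example.

```python
import random

def q_learning_with_shaping(env, potential, episodes=200, alpha=0.1,
                            gamma=0.99, epsilon=0.1):
    """Tabular epsilon-greedy Q-learning with potential-based reward shaping.

    Assumed interface: env.reset() -> state, env.step(a) -> (next_state,
    reward, done), env.actions is a list; potential maps states to floats.
    Each observed reward is augmented with gamma*Phi(s') - Phi(s).
    """
    Q = {}
    q = lambda s, a: Q.get((s, a), 0.0)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda b: q(s, b))
            s2, r, done = env.step(a)
            # potential-based shaping term; policy-invariant when
            # Phi is zero at terminal states (Ng et al. 1999)
            shaped = r + gamma * potential(s2) - potential(s)
            target = shaped if done else (
                shaped + gamma * max(q(s2, b) for b in env.actions))
            Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))
            s = s2
    return Q
```

With a potential that increases toward the goal (e.g. negative distance-to-goal), the shaping term densifies the otherwise sparse reward signal, which is the speed-up effect the paper seeks to obtain automatically rather than by hand.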