Model-based average reward reinforcement learning
Artificial Intelligence
Reinforcement learning in real-world domains suffers from three curses of dimensionality: an explosion in the state space, an explosion in the action space, and high stochasticity. We present approaches that mitigate each of these curses. To handle the state-space explosion, we introduce "tabular linear functions," which generalize both tile coding and linear value functions. To reduce action-space complexity, we replace exhaustive search over the joint action space with a form of hill climbing. To cope with high stochasticity, we introduce a new algorithm called ASH-learning, an afterstate version of H-learning. These extensions make it practical to apply reinforcement learning to a product-delivery domain, an optimization problem that combines inventory control and vehicle routing.
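The hill-climbing idea mentioned in the abstract can be illustrated with a short sketch: rather than enumerating every joint action (exponential in the number of agents or vehicles), improve one component of the joint action at a time until no single-component change raises the value. The function and variable names below are illustrative assumptions, not taken from the paper.

```python
def hill_climb_joint_action(q, action_sets, start):
    """Coordinate-ascent search over a joint action.

    q           -- callable mapping a joint-action tuple to a scalar value
    action_sets -- per-component lists of candidate actions
    start       -- initial joint action (tuple)

    Repeatedly sweep over components, keeping the best single-component
    change, until a full sweep yields no improvement. This finds a local
    optimum of q without enumerating the full joint action space.
    """
    current = list(start)
    improved = True
    while improved:
        improved = False
        for i, options in enumerate(action_sets):
            best_a, best_val = current[i], q(tuple(current))
            for a in options:
                trial = list(current)
                trial[i] = a
                v = q(tuple(trial))
                if v > best_val:
                    best_a, best_val, improved = a, v, True
            current[i] = best_a
    return tuple(current)
```

For a separable value function the sketch recovers the global optimum; in general it returns only a local one, which is the trade-off the abstract's "form of hill climbing" accepts in exchange for avoiding exhaustive joint-action search.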