Rich representations in reinforcement learning have been studied as a way to enable generalization and make learning feasible in large state spaces. We introduce Object-Oriented MDPs (OO-MDPs), a representation based on objects and their interactions, which is a natural way of modeling environments and offers important generalization opportunities. We introduce a learning algorithm for deterministic OO-MDPs and prove a polynomial bound on its sample complexity. We illustrate the performance gains of our representation and algorithm in the well-known Taxi domain as well as a real-life video game.
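To make the idea concrete, here is a minimal sketch of how an object-based state might look in code. All names (`Obj`, `on`, the attribute layout) are hypothetical illustrations, not the paper's actual formalism: the point is that a state is a collection of typed objects with attribute values, and transition preconditions are expressed as relations over those objects (e.g., in Taxi, "pickup succeeds iff the taxi and the passenger occupy the same cell").

```python
from dataclasses import dataclass

# Hypothetical sketch of an OO-MDP-style state: each object has a
# class and a tuple of attribute values (here, grid coordinates).
@dataclass(frozen=True)
class Obj:
    cls: str      # object class, e.g. "taxi" or "passenger"
    attrs: tuple  # attribute values, e.g. (x, y)

def on(a: Obj, b: Obj) -> bool:
    """Derived relation: two objects occupy the same grid cell."""
    return a.attrs == b.attrs

# A tiny Taxi-style state: the taxi and the passenger on a grid.
taxi = Obj("taxi", (1, 2))
passenger = Obj("passenger", (1, 2))

# A transition precondition would be stated over such relations,
# e.g. the pickup action has an effect only when on(taxi, passenger).
print(on(taxi, passenger))  # True: both objects are at cell (1, 2)
```

Because the dynamics are conditioned on relations rather than on raw state identities, what is learned about one object configuration transfers to any other state where the same relations hold, which is the generalization opportunity the abstract refers to.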