Tree-Based Batch Mode Reinforcement Learning

Authors:
Damien Ernst;Pierre Geurts;Louis Wehenkel
Affiliations:
-;-;-
Venue:
The Journal of Machine Learning Research
Year:
2005

Citing 0
Cited 62

Batch reinforcement learning in a complex domain

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

Machine Learning
Workstation capacity tuning using reinforcement learning

Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Non-parametric policy gradients: a unified treatment of propositional and relational domains

Proceedings of the 25th international conference on Machine learning
Transfer of samples in batch reinforcement learning

Proceedings of the 25th international conference on Machine learning
Finite-Time Bounds for Fitted Value Iteration

The Journal of Machine Learning Research
Imitative Reinforcement Learning for Soccer Playing Robots

RoboCup 2006: Robot Soccer World Cup X
An Analysis of Case-Based Value Function Approximation by Approximating State Transition Graphs

ICCBR '07 Proceedings of the 7th international conference on Case-Based Reasoning: Case-Based Reasoning Research and Development
Fitted Natural Actor-Critic: A New Algorithm for Continuous State-Action MDPs

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Lazy Planning under Uncertainty by Optimizing Decisions on an Ensemble of Incomplete Disturbance Trees

Recent Advances in Reinforcement Learning
Regularized Fitted Q-Iteration: Application to Planning

Recent Advances in Reinforcement Learning
Evaluation of Batch-Mode Reinforcement Learning Methods for Solving DEC-MDPs with Changing Action Sets

Recent Advances in Reinforcement Learning
Gaussian process dynamic programming

Neurocomputing
Learning complex motions by sequencing simpler motion templates

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Binary action search for learning continuous-action control policies

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Improving Batch Reinforcement Learning Performance through Transfer of Samples

Proceedings of the 2008 conference on STAIRS 2008: Proceedings of the Fourth Starting AI Researchers' Symposium
A Simulation-based Approach for Solving Generalized Semi-Markov Decision Processes

Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence
Reinforcement learning for robot soccer

Autonomous Robots
Decision tree methods for finding reusable MDP homomorphisms

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Robust task-based control policies for physics-based characters

ACM SIGGRAPH Asia 2009 papers
Adaptive treatment of epilepsy via batch-mode reinforcement learning

IAAI'08 Proceedings of the 20th national conference on Innovative applications of artificial intelligence - Volume 3
Closed-loop learning of visual control policies

Journal of Artificial Intelligence Research
Real-time planning for parameterized human motion

Proceedings of the 2008 ACM SIGGRAPH/Eurographics Symposium on Computer Animation
Reinforcement learning versus model predictive control: a comparison on a power system problem

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Regularized fitted Q-iteration for planning in continuous-space Markovian decision problems

ACC'09 Proceedings of the 2009 conference on American Control Conference
Model-based and model-free reinforcement learning for visual servoing

ICRA'09 Proceedings of the 2009 IEEE international conference on Robotics and Automation
Transfer Learning for Reinforcement Learning Domains: A Survey

The Journal of Machine Learning Research
Approximate dynamic programming with a fuzzy parameterization

Automatica (Journal of IFAC)
Bounds for multistage stochastic programs using supervised learning strategies

SAGA'09 Proceedings of the 5th international conference on Stochastic algorithms: foundations and applications
Motion fields for interactive character locomotion

ACM SIGGRAPH Asia 2010 papers
Gaussian processes for sample efficient reinforcement learning with RMAX-like exploration

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I
Continuous-state reinforcement learning with fuzzy approximation

ALAMAS'05/ALAMAS'06/ALAMAS'07 Proceedings of the 5th, 6th and 7th European conference on Adaptive and learning agents and multi-agent systems: adaptation and multi-agent learning
Sparse approximate dynamic programming for dialog management

SIGDIAL '10 Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Empowerment for continuous agent-environment systems

Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Hessian matrix distribution for Bayesian policy gradient reinforcement learning

Information Sciences: an International Journal
Sample-efficient batch reinforcement learning for dialogue management optimization

ACM Transactions on Speech and Language Processing (TSLP)
Exploiting Best-Match Equations for Efficient Reinforcement Learning

The Journal of Machine Learning Research
Approximate policy iteration for closed-loop learning of visual tasks

ECML'06 Proceedings of the 17th European conference on Machine Learning
Task-Driven discretization of the joint space of visual percepts and continuous actions

ECML'06 Proceedings of the 17th European conference on Machine Learning
Learning near-optimal policies with bellman-residual minimization based fitted policy iteration and a single sample path

COLT'06 Proceedings of the 19th annual conference on Learning Theory
A hybrid learning strategy for discovery of policies of action

IBERAMIA-SBIA'06 Proceedings of the 2nd international joint conference, and Proceedings of the 10th Ibero-American Conference on AI 18th Brazilian conference on Advances in Artificial Intelligence
Neural fitted q iteration – first experiences with a data efficient neural reinforcement learning method

ECML'05 Proceedings of the 16th European conference on Machine Learning
Reinforcement learning with raw image pixels as input state

IWICPAS'06 Proceedings of the 2006 Advances in Machine Vision, Image Processing, and Pattern Analysis international conference on Intelligent Computing in Pattern Analysis/Synthesis
Abstraction and generalization in reinforcement learning: a summary and framework

ALA'09 Proceedings of the Second international conference on Adaptive and Learning Agents
Data-driven dynamic emulation modelling for the optimal management of environmental systems

Environmental Modelling & Software
Sequential feature selection for classification

AI'11 Proceedings of the 24th international conference on Advances in Artificial Intelligence
Reinforcement Programming

Computational Intelligence
Cognitive concepts in autonomous soccer playing robots

Cognitive Systems Research
Goal-Directed online learning of predictive models

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Reinforcement learning with a bilinear q function

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Towards a Multiple-Lookahead-Levels agent reinforcement-learning technique and its implementation in integrated circuits

The Journal of Supercomputing
Q-Tree: automatic construction of hierarchical state representation for reinforcement learning

ICIRA'12 Proceedings of the 5th international conference on Intelligent Robotics and Applications - Volume Part III
Policy iteration based on a learned transition model

ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Learning motion controllers with adaptive depth perception

EUROSCA'12 Proceedings of the 11th ACM SIGGRAPH / Eurographics conference on Computer Animation
Learning motion controllers with adaptive depth perception

Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation
TEXPLORE: real-time sample-efficient reinforcement learning for robots

Machine Learning
Machine learning for interactive systems and robots: a brief introduction

Proceedings of the 2nd Workshop on Machine Learning for Interactive Systems: Bridging the Gap Between Perception, Action and Communication
Dynamic policy programming

The Journal of Machine Learning Research
Linear Bayesian reinforcement learning

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Employing batch reinforcement learning to control gene regulation without explicitly constructing gene regulatory networks

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Reinforcement learning algorithms with function approximation: Recent advances and applications

Information Sciences: an International Journal
A survey of multi-objective sequential decision-making

Journal of Artificial Intelligence Research

Abstract

Reinforcement learning aims to determine an optimal control policy from interaction with a system or from observations gathered from a system. In batch mode, this can be achieved by approximating the so-called Q-function from a set of four-tuples $(x_t, u_t, r_t, x_{t+1})$, where $x_t$ denotes the system state at time $t$, $u_t$ the control action taken, $r_t$ the instantaneous reward obtained, and $x_{t+1}$ the successor state of the system, and then determining the control policy from this Q-function. The Q-function approximation may be obtained as the limit of a sequence of (batch mode) supervised learning problems. Within this framework we describe the use of several classical tree-based supervised learning methods (CART, Kd-tree, tree bagging) and two newly proposed ensemble algorithms, namely extremely randomized trees and totally randomized trees. We study their performance on several examples and find that the ensemble methods based on regression trees perform well in extracting relevant information about the optimal control policy from sets of four-tuples. In particular, totally randomized trees give good results while ensuring the convergence of the sequence, whereas extremely randomized trees, which relax the convergence constraint, provide even better accuracy.
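
The iteration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual fitted Q iteration target, $r_t + \gamma \max_u \hat{Q}_{N-1}(x_{t+1}, u)$, with a discount factor $\gamma$ that the abstract does not mention, and it substitutes a toy randomized tree learner (random attribute and random cut-point at every node, rebuilt at each iteration) for the tree methods studied in the paper. All function names, parameters, and the one-dimensional test system are hypothetical.

```python
import random

def build_tree(X, y, rng, min_leaf=2):
    """Grow one randomized regression tree (random attribute, random cut-point)."""
    # Attributes that still take more than one value inside this node.
    dims = [d for d in range(len(X[0]))
            if min(row[d] for row in X) < max(row[d] for row in X)]
    if len(y) <= min_leaf or not dims:
        return sum(y) / len(y)                      # leaf: mean of the targets
    d = rng.choice(dims)
    cut = rng.uniform(min(row[d] for row in X), max(row[d] for row in X))
    left = [i for i, row in enumerate(X) if row[d] < cut]
    right = [i for i, row in enumerate(X) if row[d] >= cut]
    if not left or not right:                       # degenerate cut: stop here
        return sum(y) / len(y)
    return (d, cut,
            build_tree([X[i] for i in left], [y[i] for i in left], rng, min_leaf),
            build_tree([X[i] for i in right], [y[i] for i in right], rng, min_leaf))

def predict_tree(node, x):
    """Follow the cuts down to a leaf and return its stored mean."""
    while isinstance(node, tuple):
        d, cut, left, right = node
        node = left if x[d] < cut else right
    return node

def predict(forest, x):
    """Ensemble prediction: average over all trees."""
    return sum(predict_tree(tree, x) for tree in forest) / len(forest)

def fitted_q_iteration(samples, actions, gamma=0.9, n_iter=10, n_trees=10, seed=0):
    """Approximate Q from four-tuples (x, u, r, x_next) by repeated regression.

    gamma, n_iter and n_trees are illustrative defaults, not values from the paper.
    Terminal states are not handled in this sketch.
    """
    rng = random.Random(seed)
    X = [x + (u,) for x, u, _, _ in samples]        # regression inputs: (state, action)
    forest = None
    for _ in range(n_iter):
        if forest is None:
            y = [r for _, _, r, _ in samples]       # first iteration: Q_1 fits the reward
        else:
            y = [r + gamma * max(predict(forest, xn + (a,)) for a in actions)
                 for _, _, r, xn in samples]
        forest = [build_tree(X, y, rng) for _ in range(n_trees)]
    return forest

def greedy_action(forest, x, actions):
    """Control policy derived from the approximate Q-function."""
    return max(actions, key=lambda a: predict(forest, x + (a,)))

# Hypothetical toy system: a point on [0, 1], rewarded for reaching the right end.
def step(x, u):
    x_next = min(1.0, max(0.0, x[0] + u))
    return (1.0 if x_next >= 0.95 else 0.0), (x_next,)

ACTIONS = (-0.1, 0.1)
_rng = random.Random(1)
SAMPLES = []
for _ in range(500):
    x, u = (_rng.random(),), _rng.choice(ACTIONS)
    r, x_next = step(x, u)
    SAMPLES.append((x, u, r, x_next))

Q = fitted_q_iteration(SAMPLES, ACTIONS)
print(greedy_action(Q, (0.5,), ACTIONS))
```

Each pass through the loop is an ordinary batch regression problem, so any supervised learner could replace `build_tree` here. Because this sketch regrows the trees at every iteration, it gives no convergence guarantee for the sequence of approximations.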