Recent Advances in Hierarchical Reinforcement Learning

Authors:
Andrew G. Barto;Sridhar Mahadevan
Affiliations:
Autonomous Learning Laboratory, Department of Computer Science, University of Massachusetts, Amherst MA 01003 barto@cs.umass.edu;Autonomous Learning Laboratory, Department of Computer Science, University of Massachusetts, Amherst MA 01003 mahadeva@cs.umass.edu
Venue:
Discrete Event Dynamic Systems
Year:
2003

Citing 53
Cited 33

Learning to solve problems by searching for macro-operators

Learning to solve problems by searching for macro-operators
Dynamic programming: deterministic and stochastic models

Dynamic programming: deterministic and stochastic models
Building and understanding adaptive systems: a statistical/numerical approach to factory automation and brain research

IEEE Transactions on Systems, Man and Cybernetics
Statecharts: A visual formalism for complex systems

Science of Computer Programming
A model for reasoning about persistence and causation

Computational Intelligence
Practical Issues in Temporal Difference Learning

Machine Learning
Technical Note: Q-Learning

Machine Learning
TD-Gammon, a self-teaching backgammon program, achieves master-level play

Neural Computation
Learning to act using real-time dynamic programming

Artificial Intelligence - Special volume on computational research on interaction and agency, part 1
Average reward reinforcement learning: foundations, algorithms, and empirical results

Machine Learning - Special issue on reinforcement learning
Planning and acting in partially observable stochastic domains

Artificial Intelligence
Xavier: a robot navigation architecture based on partially observable Markov decision process models

Artificial intelligence and mobile robots
Learning hierarchical control structures for multiple tasks and changing environments

Proceedings of the fifth international conference on simulation of adaptive behavior on From animals to animats 5
Reinforcement learning with hierarchies of machines

NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Multi-time models for temporally abstract planning

NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Learning to Improve Coordinated Actions in Cooperative Distributed Problem-Solving Environments

Machine Learning
Elevator Group Control Using Multiple Reinforcement Learning Agents

Machine Learning
Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning

Artificial Intelligence
The Hierarchical Hidden Markov Model: Analysis and Applications

Machine Learning
Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms

Machine Learning
Transition network grammars for natural language analysis

Communications of the ACM
Hierarchical multi-agent reinforcement learning

Proceedings of the fifth international conference on Autonomous agents
Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence

Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence
Introduction to Stochastic Dynamic Programming: Probability and Mathematical

Introduction to Stochastic Dynamic Programming: Probability and Mathematical
Introduction to Reinforcement Learning

Introduction to Reinforcement Learning
Neuro-Dynamic Programming

Neuro-Dynamic Programming
Singular Perturbation Methods in Control: Analysis and Design

Singular Perturbation Methods in Control: Analysis and Design
A Heuristic Approach to the Discovery of Macro-Operators

Machine Learning
Theoretical Results on Reinforcement Learning with Temporally Abstract Options

ECML '98 Proceedings of the 10th European Conference on Machine Learning
Scaling Reinforcement Learning toward RoboCup Soccer

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Hierarchically Optimal Average Reward Reinforcement Learning

ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Discovering Hierarchy in Reinforcement Learning with HEXQ

ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Integrating Experimentation and Guidance in Relational Reinforcement Learning

ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Scaling Reinforcement Learning Algorithms by Learning Variable Temporal Resolution Models

ML '92 Proceedings of the Ninth International Workshop on Machine Learning
Continuous-Time Hierarchical Reinforcement Learning

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Lyapunov-Constrained Action Sets for Reinforcement Learning

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
The Complexity of Decentralized Control of Markov Decision Processes

UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence
Decision-Theoretic Planning with Concurrent Temporally Extended Actions

UAI '01 Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence
Localizing Search in Reinforcement Learning

Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence
Solving Semi-Markov Decision Problems Using Average Reward Reinforcement Learning

Management Science
Reinforcement learning with selective perception and hidden state

Reinforcement learning with selective perception and hidden state
Large-scale dynamic optimization using teams of reinforcement learning agents

Large-scale dynamic optimization using teams of reinforcement learning agents
Hierarchical control and learning for Markov decision processes

Hierarchical control and learning for Markov decision processes
Temporal abstraction in reinforcement learning

Temporal abstraction in reinforcement learning
Autonomous discovery of temporal abstractions from interaction with an environment

Autonomous discovery of temporal abstractions from interaction with an environment
Hierarchical learning and planning in partially observable Markov decision processes

Hierarchical learning and planning in partially observable Markov decision processes
Lyapunov design for safe reinforcement learning

The Journal of Machine Learning Research
Hierarchical reinforcement learning with the MAXQ value function decomposition

Journal of Artificial Intelligence Research
Learning topological maps with weak local odometric information

IJCAI'97 Proceedings of the Fifteenth international joint conference on Artificial intelligence - Volume 2
Reinforcement learning: a survey

Journal of Artificial Intelligence Research
Tractable inference for complex stochastic processes

UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence
A time aggregation approach to Markov decision processes

Automatica (Journal of IFAC)

Evolving neural network ensembles for control problems

GECCO '05 Proceedings of the 7th annual conference on Genetic and evolutionary computation
Predictive state representations with options

ICML '06 Proceedings of the 23rd international conference on Machine learning
A layered approach to learning coordination knowledge in multiagent environments

Applied Intelligence
Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation

Neural Computation
Anticipations, Brains, Individual and Social Behavior: An Introduction to Anticipatory Systems

Anticipatory Behavior in Adaptive Learning Systems
Subgoal Identification for Reinforcement Learning and Planning in Multiagent Problem Solving

MATES '07 Proceedings of the 5th German conference on Multiagent System Technologies
Dynamic Abstraction for Hierarchical Problem Solving and Execution in Stochastic Dynamic Environments

Proceedings of the 2006 conference on STAIRS 2006: Proceedings of the Third Starting AI Researchers' Symposium
Towards Automatic Model Generation by Optimization

Proceedings of the 2008 conference on Tenth Scandinavian Conference on Artificial Intelligence: SCAI 2008
Learning by Automatic Option Discovery from Conditionally Terminating Sequences

Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy
A linear-complexity reparameterisation strategy for the hierarchical bootstrapping of capabilities within perception-action architectures

Image and Vision Computing
Speeding up learning in real-time search via automatic state abstraction

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 3
Behavior bounding: an efficient method for high-level behavior comparison

Journal of Artificial Intelligence Research
State similarity based approach for improving performance in RL

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Learning to generalize and reuse skills using approximate partial policy homomorphisms

SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
Optimal policy switching algorithms for reinforcement learning

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Extending BDI plan selection to incorporate learning from experience

Robotics and Autonomous Systems
Combining reinforcement learning with symbolic planning

ALAMAS'05/ALAMAS'06/ALAMAS'07 Proceedings of the 5th, 6th and 7th European conference on Adaptive and learning agents and multi-agent systems: adaptation and multi-agent learning
Multi-agent reinforcement learning for intrusion detection

ALAMAS'05/ALAMAS'06/ALAMAS'07 Proceedings of the 5th, 6th and 7th European conference on Adaptive and learning agents and multi-agent systems: adaptation and multi-agent learning
Hierarchical behaviours: getting the most bang for your bit

ECAL'09 Proceedings of the 10th European conference on Advances in artificial life: Darwin meets von Neumann - Volume Part II
Review: learning like a baby: A survey of artificial intelligence approaches

The Knowledge Engineering Review
Incremental skill acquisition for self-motivated learning animats

SAB'06 Proceedings of the 9th international conference on From Animals to Animats: simulation of Adaptive Behavior
A model of reaching that integrates reinforcement learning and population encoding of postures

SAB'06 Proceedings of the 9th international conference on From Animals to Animats: simulation of Adaptive Behavior
Mosaic for multiple-reward environments

Neural Computation
Abstraction and generalization in reinforcement learning: a summary and framework

ALA'09 Proceedings of the Second international conference on Adaptive and Learning Agents
Effectiveness of considering state similarity for reinforcement learning

IDEAL'06 Proceedings of the 7th international conference on Intelligent Data Engineering and Automated Learning
Reinforcement learning to adjust robot movements to new situations

IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Automatic construction of temporally extended actions for MDPs using bisimulation metrics

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Learning to win by reading manuals in a monte-carlo framework

Journal of Artificial Intelligence Research
Abstraction in Model Based Partially Observable Reinforcement Learning Using Extended Sequence Trees

WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02
A generic adaptive simulation algorithm for component-based simulation systems

Proceedings of the 2013 ACM SIGSIM conference on Principles of advanced discrete simulation
Reinforcement learning in robotics: A survey

International Journal of Robotics Research
Hierarchical Social Network Analysis Using a Multi-Agent System: A School System Case

International Journal of Agent Technologies and Systems
Automatic skill acquisition in reinforcement learning using graph centrality measures

Intelligent Data Analysis

Abstract

Reinforcement learning is bedeviled by the curse of dimensionality: the number of parameters to be learned grows exponentially with the size of any compact encoding of a state. Recent attempts to combat the curse of dimensionality have turned to principled ways of exploiting temporal abstraction, where decisions are not required at each step, but rather invoke the execution of temporally-extended activities which follow their own policies until termination. This leads naturally to hierarchical control architectures and associated learning algorithms. We review several approaches to temporal abstraction and hierarchical organization that machine learning researchers have recently developed. Common to these approaches is a reliance on the theory of semi-Markov decision processes, which we emphasize in our review. We then discuss extensions of these ideas to concurrent activities, multiagent coordination, and hierarchical memory for addressing partial observability. Concluding remarks address open challenges facing the further development of reinforcement learning in a hierarchical setting.
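
The temporally extended activities and the semi-Markov decision process (SMDP) view described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the corridor environment, the function names (`corridor_step`, `run_option`, `smdp_q_update`), the activity label `"go_right"`, and the values of `GAMMA` and `alpha` are all assumptions chosen for illustration. An activity follows its own policy until termination, and the learner then makes a single SMDP-style Q-learning update that discounts the successor value by the activity's duration k.

```python
from collections import defaultdict

GAMMA = 0.9  # assumed discount factor for this illustration


def corridor_step(state, action):
    """Hypothetical toy environment: move along a line, reward -1 per step."""
    return state + action, -1.0


def run_option(state, policy, terminates, step, gamma=GAMMA):
    """Run a temporally extended activity until its termination condition holds.

    Returns (final_state, discounted_reward_sum, duration_in_steps).
    """
    total, discount, k = 0.0, 1.0, 0
    while True:
        state, reward = step(state, policy(state))
        total += discount * reward  # accumulate discounted reward inside the activity
        discount *= gamma
        k += 1
        if terminates(state):
            return state, total, k


def smdp_q_update(q, s, o, r, s_next, k, option_names, alpha=0.5, gamma=GAMMA):
    """One SMDP Q-learning update: the successor value is discounted by gamma**k."""
    best_next = max(q[(s_next, o2)] for o2 in option_names)
    q[(s, o)] += alpha * (r + gamma ** k * best_next - q[(s, o)])


# One decision at state 0: invoke "go_right", which steps right until state 4.
q = defaultdict(float)
s_next, r, k = run_option(0, lambda s: +1, lambda s: s == 4, corridor_step)
smdp_q_update(q, 0, "go_right", r, s_next, k, ["go_right"])
```

In this sketch the learner makes one decision and one value update for four primitive steps, which is the sense in which decisions are not required at each step.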