Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning

Authors:
Ishai Menache;Shie Mannor;Nahum Shimkin
Venue:
ECML '02 Proceedings of the 13th European Conference on Machine Learning
Year:
2002

Abstract

We present the Q-Cut algorithm, a graph-theoretic approach for automatic detection of sub-goals in a dynamic environment, which is used to accelerate the Q-Learning algorithm. The learning agent creates an on-line map of the process history and uses an efficient Max-Flow/Min-Cut algorithm to identify bottlenecks. The policies for reaching bottlenecks are learned separately and added to the model in the form of options (macro-actions). We then extend the basic Q-Cut algorithm to the Segmented Q-Cut algorithm, which uses previously identified bottlenecks to partition the state space, a step necessary for finding additional bottlenecks in complex environments. Experiments show significant performance improvements, particularly in the initial learning phase.
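
The bottleneck-detection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the on-line map is a directed graph of observed state transitions, that arc capacities are raw transition counts, and that Edmonds-Karp stands in for whichever efficient Max-Flow/Min-Cut algorithm the paper uses. The names `build_graph`, `min_cut`, and the two-room example states are hypothetical.

```python
from collections import defaultdict, deque

def build_graph(history):
    """Count observed transitions; counts act as arc capacities (an assumption)."""
    cap = defaultdict(lambda: defaultdict(int))
    for s, s_next in history:
        cap[s][s_next] += 1
        cap[s_next][s] += 0  # make sure the reverse residual arc exists
    return cap

def min_cut(cap, source, sink):
    """Edmonds-Karp max-flow; returns (flow value, arcs crossing a minimum cut)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}  # residual capacities
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        flow += push
    # States still reachable from the source form the source side of the cut.
    reachable = set(parent)
    cut = sorted((u, v) for u in reachable
                 for v, c in cap[u].items() if c > 0 and v not in reachable)
    return flow, cut

# Hypothetical two-room world: densely visited rooms joined by a rarely used door.
history = (
    [("a1", "a2")] * 5 + [("a2", "a3")] * 5 + [("a1", "a3")] * 5
    + [("a3", "door")] * 2 + [("door", "b1")] * 2
    + [("b1", "b2")] * 5 + [("b2", "b3")] * 5 + [("b1", "b3")] * 5
)
flow, cut = min_cut(build_graph(history), source="a1", sink="b3")
print(flow, cut)  # 2 [('a3', 'door')]
```

In this toy history the minimum cut isolates the arc into the doorway, so `door` would be the candidate sub-goal for which an option policy is then learned. How to choose the source and sink, and how to judge whether a cut is a good bottleneck, are left open in this sketch.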