We are interested in the following general question: is it possible to abstract knowledge generated while learning the solution to a problem, so that this abstraction accelerates the learning process? Moreover, is it possible to transfer and reuse the acquired abstract knowledge to accelerate learning in future, similar tasks? We propose a framework for conducting two levels of reinforcement learning simultaneously, in which an abstract policy is learned while a concrete policy is learned for the problem, so that both policies are refined through the agent's exploration and interaction with the environment. We use abstraction both to accelerate learning of an optimal concrete policy for the current problem and to allow the resulting abstract policy to be applied when learning solutions to new problems. We report experiments in a robot navigation environment showing that our framework is effective in speeding up policy construction for practical problems and in generating abstractions that can be reused to accelerate learning in new, similar problems.
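The abstract does not spell out the algorithmic details, but a minimal sketch of the two-level idea might look as follows, assuming tabular Q-learning, a hypothetical grid environment `env` with a `reset`/`step` interface, and an illustrative state-abstraction function `phi` that coarsens grid positions; none of these choices are taken from the paper itself. A concrete Q-table and an abstract Q-table are updated from the same experience, the abstract policy occasionally guides exploration, and the abstract table learned on one task could be carried over to seed exploration on a similar task.

```python
# Hypothetical sketch of two-level Q-learning: a concrete Q-table over ground
# states and an abstract Q-table over abstracted states are updated from the
# same transitions. The environment interface, the abstraction function phi,
# and all hyperparameters are illustrative assumptions, not the authors' spec.

import random
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
ABSTRACT_BIAS = 0.3  # chance that an exploratory step follows the abstract policy


def phi(state):
    """Hypothetical abstraction: coarsen an (x, y) grid position to a 2x2 region."""
    x, y = state
    return (x // 2, y // 2)


def greedy(q, state):
    return max(ACTIONS, key=lambda a: q[(state, a)])


def choose_action(q_concrete, q_abstract, state):
    """Epsilon-greedy on the concrete policy; exploration is sometimes guided
    by the abstract policy instead of being uniformly random."""
    if random.random() < EPSILON:
        if random.random() < ABSTRACT_BIAS:
            return greedy(q_abstract, phi(state))
        return random.choice(ACTIONS)
    return greedy(q_concrete, state)


def learn(env, episodes=500, q_abstract=None):
    """Learn a concrete policy; optionally reuse an abstract Q-table from a
    previous, similar task to bias exploration from the start."""
    q_concrete = defaultdict(float)                       # keyed by (ground state, action)
    q_abstract = q_abstract or defaultdict(float)         # keyed by (abstract state, action)
    for _ in range(episodes):
        state = env.reset()                               # assumed to return an (x, y) tuple
        done = False
        while not done:
            action = choose_action(q_concrete, q_abstract, state)
            next_state, reward, done = env.step(action)   # assumed Gym-like step signature
            # Concrete-level Q-learning update.
            best_next = 0.0 if done else max(q_concrete[(next_state, a)] for a in ACTIONS)
            q_concrete[(state, action)] += ALPHA * (
                reward + GAMMA * best_next - q_concrete[(state, action)]
            )
            # Abstract-level update from the same transition, via phi.
            s_abs, ns_abs = phi(state), phi(next_state)
            best_next_abs = 0.0 if done else max(q_abstract[(ns_abs, a)] for a in ACTIONS)
            q_abstract[(s_abs, action)] += ALPHA * (
                reward + GAMMA * best_next_abs - q_abstract[(s_abs, action)]
            )
            state = next_state
    return q_concrete, q_abstract  # q_abstract can be passed to learn() on a new task
```

In this sketch the transfer step is simply passing the returned `q_abstract` into a new call to `learn()` on a similar task, so that early exploration is biased toward regions the abstract policy already favors; the actual transfer mechanism in the paper may differ.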