A distributed Q-learning approach for variable attention to multiple critics

Authors:
Maryam Tavakol;Majid Nili Ahmadabadi;Maryam Mirian;Masoud Asadpour
Affiliations:
Cognitive Robotics Lab, Control and Intelligent Processing Center of Excellence, School of ECE., College of Eng., Univ. of Tehran, Iran;Cognitive Robotics Lab, Control and Intelligent Processing Center of Excellence, School of ECE., College of Eng., Univ. of Tehran, Iran, School of Cognitive Sciences, Institute for Research in Fund ...;Cognitive Robotics Lab, Control and Intelligent Processing Center of Excellence, School of ECE., College of Eng., Univ. of Tehran, Iran;Cognitive Robotics Lab, Control and Intelligent Processing Center of Excellence, School of ECE., College of Eng., Univ. of Tehran, Iran
Venue:
ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part III
Year:
2012

Abstract

A central concern in machine learning is the design of an artificial agent that behaves autonomously in a complex environment. In this paper, we consider a learning problem with multiple critics. The critics differ in their importance to the agent, and the agent's attention to each of them varies over its lifetime. Inspired by neurological studies, we propose a distributed learning approach to this problem that is flexible with respect to variable attention. In this approach, a distinct learner is assigned to each critic, and we introduce an algorithm that aggregates the learners' knowledge by combining model-free and model-based learning methods. We show that this aggregation method can provide the optimal policy for this problem.
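
To make the "one learner per critic" idea concrete, the following is a minimal sketch and not the aggregation algorithm of the paper. It assumes a hypothetical three-state chain world, two hypothetical critics (`critic_left`, `critic_right`), and a plain attention-weighted sum of Q-values as a stand-in aggregation rule. All names and parameters are illustrative assumptions. Unlike the method described in the abstract, a weighted sum of independently learned Q-values is not guaranteed to be optimal in general.

```python
import numpy as np

# Hypothetical toy world: positions 0..2 on a line; actions: 0 = left, 1 = right.
N_STATES, N_ACTIONS = 3, 2
GAMMA = 0.9

def step(state, action):
    """Deterministic move along the line, clipped at both ends."""
    return max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))

# Hypothetical critics: each one scores the state the agent lands in.
def critic_left(next_state):
    return 1.0 if next_state == 0 else 0.0

def critic_right(next_state):
    return 1.0 if next_state == N_STATES - 1 else 0.0

def learn_q(critic, sweeps=200, alpha=0.5):
    """Tabular Q-learning for a single critic, sweeping every (state, action) pair."""
    q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(sweeps):
        for s in range(N_STATES):
            for a in range(N_ACTIONS):
                s2 = step(s, a)
                target = critic(s2) + GAMMA * q[s2].max()
                q[s, a] += alpha * (target - q[s, a])
    return q

def greedy_action(q_tables, attention, state):
    """Assumed aggregation rule: maximize the attention-weighted sum of Q-values."""
    combined = sum(w * q[state] for w, q in zip(attention, q_tables))
    return int(np.argmax(combined))

# One learner per critic, each trained independently of the attention weights.
q_tables = [learn_q(critic_left), learn_q(critic_right)]

# Changing attention changes behaviour without retraining any learner.
print(greedy_action(q_tables, (0.9, 0.1), state=1))
print(greedy_action(q_tables, (0.1, 0.9), state=1))
```

In this sketch the per-critic tables are learned once and only the attention weights change at decision time, which is one simple way to read the flexibility under variable attention that the abstract describes.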