Integrating Temporal Difference Methods and Self-Organizing Neural Networks for Reinforcement Learning With Delayed Evaluative Feedback

Authors:
Ah-Hwee Tan;Ning Lu;Dan Xiao
Affiliations:
Nanyang Technological University, Singapore;-;-
Venue:
IEEE Transactions on Neural Networks
Year:
2008

Citing 0
Cited 12

Intelligence Through Interaction: Towards a Unified Theory for Learning

ISNN '07 Proceedings of the 4th international symposium on Neural Networks: Advances in Neural Networks
Cognitive Agents Integrating Rules and Reinforcement Learning for Context-Aware Decision Support

WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Scaling Up Multi-agent Reinforcement Learning in Complex Domains

WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Integrated cognitive architectures: a survey

Artificial Intelligence Review
A self-organizing neural network architecture for intentional planning agents

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Q-Learning Based on Dynamical Structure Neural Network for Robot Navigation in Unknown Environment

ISNN 2009 Proceedings of the 6th International Symposium on Neural Networks: Advances in Neural Networks - Part III
A self-organizing neural architecture integrating desire, intention and reinforcement learning

Neurocomputing
Agent-augmented co-space: toward merging of real world and cyberspace

ATC'10 Proceedings of the 7th international conference on Autonomic and trusted computing
A hybrid agent architecture integrating desire, intention and reinforcement learning

Expert Systems with Applications: An International Journal
iFALCON: A neural architecture for hierarchical planning

Neurocomputing
Knowledge-Based Exploration for Reinforcement Learning in Self-Organizing Neural Networks

WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02
Bootstrapping learning from abstract models in games

International Journal of Bio-Inspired Computation

Abstract

This paper presents a neural architecture for learning category nodes encoding mappings across multimodal patterns involving sensory inputs, actions, and rewards. By integrating adaptive resonance theory (ART) and temporal difference (TD) methods, the proposed neural model, called TD fusion architecture for learning, cognition, and navigation (TD-FALCON), enables an autonomous agent to adapt and function in a dynamic environment with immediate as well as delayed evaluative feedback (reinforcement) signals. TD-FALCON learns the value functions of the state-action space estimated through on-policy and off-policy TD learning methods, specifically state-action-reward-state-action (SARSA) and Q-learning. The learned value functions are then used to determine the optimal actions based on an action selection policy. We have developed TD-FALCON systems using various TD learning strategies and compared their performance in terms of task completion, learning speed, and time and space efficiency. Experiments based on a minefield navigation task have shown that TD-FALCON systems are able to learn effectively with both immediate and delayed reinforcement and achieve stable performance at a pace much faster than that of standard gradient-descent-based reinforcement learning systems.
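
The difference between the two TD methods named in the abstract can be illustrated with a minimal tabular sketch. This is not the TD-FALCON model itself, which the abstract describes as storing value estimates in an ART-based network; a plain dictionary stands in for that network here. The function names, the learning rate `alpha`, and the discount factor `gamma` are all assumptions made for illustration. SARSA (on-policy) bootstraps from the action the agent actually takes next, while Q-learning (off-policy) bootstraps from the highest-valued next action.

```python
from typing import Dict, Hashable, Iterable, Tuple

# Hypothetical tabular stand-in for a learned value function Q(s, a).
QTable = Dict[Tuple[Hashable, Hashable], float]


def sarsa_target(q: QTable, reward: float, next_state: Hashable,
                 next_action: Hashable, gamma: float = 0.9) -> float:
    """On-policy target: uses the value of the action actually chosen next."""
    return reward + gamma * q.get((next_state, next_action), 0.0)


def q_learning_target(q: QTable, reward: float, next_state: Hashable,
                      actions: Iterable[Hashable], gamma: float = 0.9) -> float:
    """Off-policy target: uses the best available next action's value."""
    best = max(q.get((next_state, a), 0.0) for a in actions)
    return reward + gamma * best


def td_update(q: QTable, state: Hashable, action: Hashable,
              target: float, alpha: float = 0.5) -> float:
    """Move Q(state, action) a fraction alpha toward the TD target."""
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]


if __name__ == "__main__":
    q: QTable = {("s1", "left"): 1.0, ("s1", "right"): 2.0}
    actions = ["left", "right"]
    # The agent moved from s0 to s1 with zero reward and will go "left" next.
    print("SARSA target:     ", sarsa_target(q, 0.0, "s1", "left", gamma=0.5))
    print("Q-learning target:", q_learning_target(q, 0.0, "s1", actions, gamma=0.5))
```

With delayed feedback, the reward term is zero on most steps, so the bootstrapped term is what carries value information back from the eventual outcome to earlier state-action pairs.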