In this article, we present an isotropic unsupervised algorithm for temporal sequence learning. No special reward signal is used, so all inputs are completely isotropic. All input signals are bandpass filtered before converging onto a linear output neuron, and all synaptic weights change according to the correlation of the bandpass-filtered inputs with the derivative of the output. We investigate the algorithm in an open- and a closed-loop condition, the latter defined by embedding the learning system in a behavioral feedback loop. In the open-loop condition, we find that the linear structure of the algorithm allows the shape of the weight change to be calculated analytically; it is strictly heterosynaptic and follows the weight-change curves found in spike-timing-dependent plasticity. Furthermore, we show that synaptic weights stabilize automatically, without additional normalizing measures, as soon as no temporal differences remain between the inputs. In the second part of this study, the algorithm is placed in an environment that leads to a closed sensorimotor loop. To this end, a robot is programmed with a prewired retraction reflex in response to collisions. Through isotropic sequence order (ISO) learning, the robot achieves collision avoidance by learning the correlation between its early range-finder signals and the later-occurring collision signal. Synaptic weights stabilize at the end of learning, as theoretically predicted. Finally, we discuss the relation of ISO learning to other drive-reinforcement models and to the commonly used temporal difference learning algorithm. This study is followed up by a mathematical analysis of the closed-loop situation in the companion article in this issue, "ISO Learning Approximates a Solution to the Inverse-Controller Problem in an Unsupervised Behavioral Paradigm" (pp. 865-884).
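The open-loop rule described above can be illustrated with a minimal numerical sketch: every input is convolved with a bandpass impulse response, the filtered traces converge on a linear output unit, and each weight is updated in proportion to the correlation of its filtered input with the temporal derivative of the output. This is only a schematic reconstruction from the abstract, not the article's implementation; the damped-sine impulse response, the Euler discretization, the parameter values, and the two-input setup (a later "reflex" pulse preceded by an earlier predictive pulse) are all illustrative assumptions.

```python
import numpy as np

def bandpass_impulse(t, a=0.05, f=0.02):
    """Illustrative bandpass impulse response: a damped sine wave.
    (Assumed filter shape; the article's exact filters may differ.)"""
    return np.exp(-a * t) * np.sin(2.0 * np.pi * f * t)

def iso_learning(x, mu=0.01, w_init=None, kernel_len=200):
    """Sketch of the isotropic sequence-order learning rule.

    x      : (T, n) array of raw input signals.
    mu     : learning rate (illustrative value).
    w_init : initial weights; defaults to reflex weight 1, others 0
             (an assumption for this demo -- all weights remain plastic).
    Returns the final weight vector after one pass through the data.
    """
    T, n = x.shape
    h = bandpass_impulse(np.arange(kernel_len))
    # Bandpass-filter every input channel before the linear output neuron.
    u = np.stack([np.convolve(x[:, i], h)[:T] for i in range(n)], axis=1)

    w = np.array([1.0] + [0.0] * (n - 1)) if w_init is None else w_init.copy()
    v_prev = 0.0
    for step in range(T):
        v = u[step] @ w                 # linear output neuron
        dv = v - v_prev                 # discrete derivative of the output
        w = w + mu * u[step] * dv       # isotropic rule: every weight changes
        v_prev = v
    return w

# Demo: a predictive pulse on channel 1 precedes a "reflex" pulse on channel 0.
T = 500
x = np.zeros((T, 2))
x[80, 1] = 1.0    # earlier, predictive input
x[100, 0] = 1.0   # later input (the reflex-triggering signal)
w = iso_learning(x)
```

Because the rule correlates a filtered input trace with the output derivative, the weight of the earlier channel changes only while both traces overlap in time; once the inputs carry no temporal difference, the update term vanishes, which is the self-stabilization property the abstract refers to.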