Reinforcement learning is a fundamental process by which organisms learn to achieve goals through their interactions with the environment. Using evolutionary computation techniques, we evolve (near-)optimal neuronal learning rules in a simple neural network model of reinforcement learning in bumblebees foraging for nectar. The resulting neural networks exhibit efficient reinforcement learning, allowing the bees to respond rapidly to changes in reward contingencies. The evolved synaptic plasticity dynamics give rise to varying exploration/exploitation levels and to the well-documented choice strategies of risk aversion and probability matching. Additionally, risk aversion is shown to emerge even when bees are evolved in a completely risk-less environment. In contrast to existing theories in economics and game theory, risk-averse behavior is shown to be a direct consequence of (near-)optimal reinforcement learning, without requiring additional assumptions such as a nonlinear subjective utility function for rewards. Our results are corroborated by a rigorous mathematical analysis, and their robustness in real-world situations is supported by experiments with a mobile robot. We thus provide a biologically founded, parsimonious, and novel explanation for risk aversion and probability matching.