Exploration bonuses and dual control

Authors:
Peter Dayan; Terrence J. Sejnowski
Venue:
Machine Learning
Year:
1996

Citing 0
Cited 13

Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms

Machine Learning
Dopamine: generalization and bonuses

Neural Networks - Computational models of neuromodulation
Control of exploitation-exploration meta-parameter in reinforcement learning

Neural Networks - Computational models of neuromodulation
Exploration Strategies for Model-based Learning in Multi-agent Systems

Autonomous Agents and Multi-Agent Systems
Biasing Exploration in an Anticipatory Learning Classifier System

IWLCS '01 Revised Papers from the 4th International Workshop on Advances in Learning Classifier Systems
Hidden-Mode Markov Decision Processes for Nonstationary Sequential Decision Making

Sequence Learning - Paradigms, Algorithms, and Applications
Sequential Decision Making Based on Direct Search

Sequence Learning - Paradigms, Algorithms, and Applications
Reliability of internal prediction/estimation and its application: I. adaptive action selection reflecting reliability of value function

Neural Networks
The Two Facets of the Exploration-Exploitation Dilemma

IAT '06 Proceedings of the IEEE/WIC/ACM international conference on Intelligent Agent Technology
Model-based reinforcement learning: a computational model and an fMRI study

Neurocomputing
Robots that learn language: developmental approach to human-machine conversations

EELC'06 Proceedings of the Third international conference on Emergence and Evolution of Linguistic Communication: symbol Grounding and Beyond
Phasic dopamine as a prediction error of intrinsic and extrinsic reinforcements driving both action acquisition and reward maximization: A simulated robotic study

Neural Networks
Scalable and efficient Bayes-adaptive reinforcement learning based on Monte-Carlo tree search

Journal of Artificial Intelligence Research

Abstract