An important application of reinforcement learning (RL) is to finite-state control problems, and one of the most difficult problems in learning for control is balancing the exploration/exploitation tradeoff. Existing theoretical results for RL give very little guidance on reasonable ways to perform exploration. In this paper, we examine the convergence of single-step on-policy RL algorithms for control. On-policy algorithms cannot separate exploration from learning and therefore must confront the exploration problem directly. We prove convergence results for several related on-policy algorithms with both decaying exploration and persistent exploration. We also provide examples of exploration strategies that can be followed during learning that result in convergence to both optimal values and optimal policies.
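To make the setting concrete, the following is a minimal sketch of one single-step on-policy update with decaying exploration. It is an illustration under stated assumptions, not the algorithm analyzed in the paper: the abstract names no specific algorithm, so tabular Sarsa(0) is assumed here, along with an epsilon-greedy rule whose epsilon is 1 over the number of visits to the current state and a step size of 1 over the number of updates to each state-action pair. The chain environment and the names `step`, `epsilon_greedy`, and `sarsa0` are hypothetical.

```python
import random


def step(state, action, n_states):
    """Hypothetical deterministic chain: action 1 moves right, action 0 moves left.

    Reaching the right end yields reward 1 and ends the episode.
    """
    nxt = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = nxt == n_states - 1
    return nxt, (1.0 if done else 0.0), done


def epsilon_greedy(q, state, visits, rng):
    """Assumed decaying exploration: epsilon = 1 / (visits to this state)."""
    visits[state] += 1
    eps = 1.0 / visits[state]
    if rng.random() < eps:
        return rng.randrange(2)
    # Greedy action, with ties broken uniformly at random.
    best = max(q[state])
    return rng.choice([a for a in range(2) if q[state][a] == best])


def sarsa0(n_states=5, episodes=2000, gamma=0.9, seed=0, max_steps=1000):
    """Tabular Sarsa(0): the target uses the action actually taken next."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    visits = [0] * n_states                      # per-state visit counts
    counts = [[0, 0] for _ in range(n_states)]   # per-pair update counts
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(q, s, visits, rng)
        for _ in range(max_steps):
            s2, r, done = step(s, a, n_states)
            counts[s][a] += 1
            alpha = 1.0 / counts[s][a]           # assumed decaying step size
            if done:
                target = r
            else:
                # On-policy: the exploratory choice a2 feeds into the update,
                # so exploration cannot be separated from learning.
                a2 = epsilon_greedy(q, s2, visits, rng)
                target = r + gamma * q[s2][a2]
            q[s][a] += alpha * (target - q[s][a])
            if done:
                break
            s, a = s2, a2
    return q
```

Calling `sarsa0()` returns the learned table of action values. Because the update target uses the action the exploring policy actually selects, the values being learned depend on how exploration is scheduled, which is why the choice of exploration strategy matters for convergence in the on-policy case.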