Reinforcement Learning Agents

Authors:
C. Ribeiro
Affiliations:
Instituto Tecnológico de Aeronáutica (E-mail: carlos@ita.br)
Venue:
Artificial Intelligence Review
Year:
2002

Abstract

Reinforcement Learning (RL) is learning through direct experimentation. It does not assume the existence of a teacher that provides examples upon which learning of a task takes place. Instead, in RL experience is the only teacher. With historical roots in the study of biological conditioned reflexes, RL attracts the interest of Engineers and Computer Scientists because of its theoretical relevance and potential applications in fields as diverse as Operational Research and Intelligent Robotics.

Computationally, RL is intended to operate in a learning environment composed of two subjects: the learner and a dynamic process. At successive time steps, the learner makes an observation of the process state, selects an action and applies it back to the process. Its goal is to find an action policy that controls the behavior of the dynamic process, guided by signals (reinforcements) that indicate how badly or well it has been performing the required task. These signals are usually associated with a dramatic condition, e.g., accomplishment of a subtask (reward) or complete failure (punishment), and the learner tries to optimize its behavior by using a performance measure (a function of the received reinforcements). The crucial point is that, in order to do that, the learner must evaluate the conditions (associations between observed states and chosen actions) that led to rewards or punishments.

Starting from basic concepts, this tutorial presents the many flavors of RL algorithms, develops the corresponding mathematical tools, assesses their practical limitations and discusses alternatives that have been proposed for applying RL to realistic tasks.
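
The interaction loop described in the abstract (observe the process state, select an action, apply it, receive a reinforcement) can be illustrated with a minimal sketch. The abstract does not name a specific algorithm, so tabular Q-learning is used here purely as one well-known example. The corridor process, the function names `step` and `q_learning`, and all parameter values are hypothetical choices made for this illustration. For simplicity the learner acts at random while learning. This is an assumption, acceptable here because Q-learning estimates the value of the greedy policy regardless of how actions are explored.

```python
import random

# Hypothetical toy process (not from the paper): a corridor of 5 cells.
# Reaching the right end is a reward; stepping off the left end is a punishment.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Apply an action to the process; return (next_state, reinforcement, done)."""
    nxt = state + action
    if nxt < 0:
        return state, -1.0, True   # complete failure (punishment)
    if nxt >= N_STATES - 1:
        return nxt, 1.0, True      # subtask accomplished (reward)
    return nxt, 0.0, False


def q_learning(episodes=2000, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with purely random exploration (illustrative assumption)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = N_STATES // 2, False
        while not done:
            a = rng.randrange(len(ACTIONS))         # select an action
            nxt, r, done = step(state, ACTIONS[a])  # apply it to the process
            # Evaluate the (state, action) pair that led to the reinforcement.
            target = r if done else r + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


q = q_learning()
# Greedy action policy for the non-terminal cells 0..3.
policy = [ACTIONS[max((0, 1), key=lambda i: q[s][i])] for s in range(N_STATES - 1)]
print(policy)
```

In this sketch the performance measure is the discounted sum of reinforcements (discount factor `gamma`), and the table `q` stores the learner's evaluation of each association between an observed state and a chosen action.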