Learning to Predict by the Methods of Temporal Differences

  • Authors: Richard S. Sutton
  • Affiliation: GTE Laboratories Incorporated, 40 Sylvan Road, Waltham, MA 02254, U.S.A. RICH@GTE.COM
  • Venue: Machine Learning
  • Year: 1988

Abstract

This article introduces a class of incremental learning procedures specialized for prediction – that is, for using past experience with an incompletely known system to predict its future behavior. Whereas conventional prediction-learning methods assign credit by means of the difference between predicted and actual outcomes, the new methods assign credit by means of the difference between temporally successive predictions. Although such temporal-difference methods have been used in Samuel's checker player, Holland's bucket brigade, and the author's Adaptive Heuristic Critic, they have remained poorly understood. Here we prove their convergence and optimality for special cases and relate them to supervised-learning methods. For most real-world prediction problems, temporal-difference methods require less memory and less peak computation than conventional methods and they produce more accurate predictions. We argue that most problems to which supervised learning is currently applied are really prediction problems of the sort to which temporal-difference methods can be applied to advantage.
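The abstract's central idea, assigning credit from the difference between temporally successive predictions rather than between a prediction and the final outcome, can be sketched with the simplest member of the family, a TD(0)-style update, on a toy random-walk prediction task. This is an illustrative sketch, not code from the article; the task (a 5-state walk with terminal reward 1 on the right) and all names here are chosen for the example.

```python
import random

def td0_episode(V, alpha=0.1, gamma=1.0):
    """Run one episode of a TD(0)-style update on a 5-state random walk.

    States 1..5 are non-terminal; 0 and 6 are terminal. Reaching state 6
    yields reward 1, reaching state 0 yields 0. After each step, the
    estimate V[s] is nudged toward the *next* prediction (r + gamma * V[s']),
    i.e. credit is assigned from temporally successive predictions rather
    than waiting for the episode's final outcome.
    """
    s = 3  # start in the middle state
    while s not in (0, 6):
        s_next = s + random.choice((-1, 1))
        r = 1.0 if s_next == 6 else 0.0
        # Terminal states contribute no further prediction.
        target = r if s_next in (0, 6) else r + gamma * V[s_next]
        V[s] += alpha * (target - V[s])  # temporal-difference step
        s = s_next

random.seed(0)
V = [0.0] * 7  # value estimates for states 0..6
for _ in range(2000):
    td0_episode(V)
# For this walk the true values of states 1..5 are 1/6, 2/6, ..., 5/6,
# and the estimates converge toward them.
```

Note the incremental character the abstract emphasizes: each update uses only the current transition, so memory and peak computation per step are constant, whereas an outcome-based (supervised) learner must hold each episode's observations until its final outcome is known.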