Learning methods based on dynamic programming (DP) are receiving increasing attention in artificial intelligence. Researchers have argued that DP provides the appropriate basis for compiling planning results into reactive strategies for real-time control, as well as for learning such strategies when the system being controlled is incompletely known. We introduce an algorithm based on DP, which we call Real-Time DP (RTDP), by which an embedded system can improve its performance with experience. RTDP generalizes Korf's Learning-Real-Time-A* algorithm to problems involving uncertainty. We invoke results from the theory of asynchronous DP to prove that RTDP achieves optimal behavior in several different classes of problems. We also use the theory of asynchronous DP to illuminate aspects of other DP-based reinforcement learning methods such as Watkins' Q-Learning algorithm. A secondary aim of this article is to provide a bridge between AI research on real-time planning and learning and relevant concepts and algorithms from control theory.
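To make the idea concrete, the following is a minimal sketch of trial-based RTDP on a toy stochastic shortest-path problem. It is an illustration under stated assumptions, not the article's own implementation: the five-state chain world, the 0.8 success probability, the unit step cost, and all names (`transitions`, `q_value`, `rtdp`) are hypothetical. The value estimates start at zero, which underestimates the true cost-to-go, and only the states the agent actually visits are backed up.

```python
import random

# Hypothetical toy problem: a chain of states 0..4 with the goal at state 4.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = (+1, -1)   # try to move right or left
P_SUCCESS = 0.8      # assumed probability that a move succeeds
STEP_COST = 1.0      # assumed cost of every step until the goal


def transitions(state, action):
    """Return [(probability, next_state), ...] for the toy chain world."""
    target = min(max(state + action, 0), GOAL)
    return [(P_SUCCESS, target), (1.0 - P_SUCCESS, state)]


def q_value(values, state, action):
    """One-step lookahead: immediate cost plus expected cost-to-go."""
    return STEP_COST + sum(p * values[s2] for p, s2 in transitions(state, action))


def rtdp(n_trials=200, start=0, seed=0, max_steps=10_000):
    """Run repeated trials, backing up only the states actually visited."""
    rng = random.Random(seed)
    values = [0.0] * N_STATES  # zero underestimates the true cost-to-go
    for _ in range(n_trials):
        state = start
        for _ in range(max_steps):
            if state == GOAL:
                break
            # Greedy action with respect to the current value estimates.
            best_action = min(ACTIONS, key=lambda a: q_value(values, state, a))
            # Asynchronous DP backup of the current state only.
            values[state] = q_value(values, state, best_action)
            # Sample the outcome of the chosen action, as real experience would.
            probs, nexts = zip(*transitions(state, best_action))
            state = rng.choices(nexts, weights=probs)[0]
    return values


if __name__ == "__main__":
    print(rtdp())
```

In this toy problem each successful step to the right takes 1 / 0.8 = 1.25 steps in expectation, so the optimal cost-to-go from state s is 1.25 * (4 - s), and the estimates returned by `rtdp()` should approach those values as the number of trials grows.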