Feature-based methods for large scale dynamic programming

Authors:
John N. Tsitsiklis;Benjamin van Roy
Affiliations:
-;-
Venue:
Machine Learning - Special issue on reinforcement learning
Year:
1996

Citing 0
Cited 67

Mean-field theory for batched TD(λ)

Neural Computation
Module-Based Reinforcement Learning: Experiments with a Real Robot

Machine Learning - Special issue on learning in autonomous robots
Convergence analysis of temporal-difference learning algorithms with linear function approximation

COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Module-Based Reinforcement Learning: Experiments with a Real Robot

Autonomous Robots
The Relations Among Potentials, Perturbation Analysis, and Markov Decision Processes

Discrete Event Dynamic Systems
Embedding a Priori Knowledge in Reinforcement Learning

Journal of Intelligent and Robotic Systems
Kernel-Based Reinforcement Learning

Machine Learning
Near-Optimal Reinforcement Learning in Polynomial Time

Machine Learning
Reinforcement Learning Agents

Artificial Intelligence Review
Module Based Reinforcement Learning: An Application to a Real Robot

EWLR-6 Proceedings of the 6th European Workshop on Learning Robots
On the Asymptotic Behaviour of a Constant Stepsize Temporal-Difference Learning Algorithm

EuroCOLT '99 Proceedings of the 4th European Conference on Computational Learning Theory
Piecewise linear value function approximation for factored MDPs

Eighteenth national conference on Artificial intelligence
Efficient max-norm distance computation and reliable voxelization

Proceedings of the 2003 Eurographics/ACM SIGGRAPH symposium on Geometry processing
Interpolation-based Q-learning

ICML '04 Proceedings of the twenty-first international conference on Machine learning
Counter example for Q-bucket-brigade under prediction problem

GECCO '05 Proceedings of the 7th annual workshop on Genetic and evolutionary computation
Finite time bounds for sampling based fitted value iteration

ICML '05 Proceedings of the 22nd international conference on Machine learning
A Reinforcement Learning Algorithm in Cooperative Multi-Robot Domains

Journal of Intelligent and Robotic Systems
On the relationship between MDPs and the BDI architecture

AAMAS '06 Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees

Mathematics of Operations Research
Performance Loss Bounds for Approximate Value Iteration with State Aggregation

Mathematics of Operations Research
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning

Artificial Intelligence
Continuous State Dynamic Programming via Nonexpansive Approximation

Computational Economics
The optimizing-simulator: merging simulation and optimization using approximate dynamic programming

Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come
Brief paper: Policy iteration based feedback control

Automatica (Journal of IFAC)
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

Machine Learning
Water reservoir control under economic, social and environmental constraints

Automatica (Journal of IFAC)
An analysis of reinforcement learning with function approximation

Proceedings of the 25th international conference on Machine learning
Finite-Time Bounds for Fitted Value Iteration

The Journal of Machine Learning Research
Model-Based Reinforcement Learning in a Complex Domain

RoboCup 2007: Robot Soccer World Cup XI
Practical solution techniques for first-order MDPs

Artificial Intelligence
Approximate dynamic programming: lessons from the field

Proceedings of the 40th Conference on Winter Simulation
Reinforcement Learning: A Tutorial Survey and Recent Advances

INFORMS Journal on Computing
The optimizing-simulator: An illustration using the military airlift problem

ACM Transactions on Modeling and Computer Simulation (TOMACS)
Value-function approximations for partially observable Markov decision processes

Journal of Artificial Intelligence Research
Nonapproximability results for partially observable Markov decision processes

Journal of Artificial Intelligence Research
Efficient solution algorithms for factored MDPs

Journal of Artificial Intelligence Research
Approximate policy iteration with a policy language bias: solving relational Markov decision processes

Journal of Artificial Intelligence Research
Monte Carlo sampling methods for approximating interactive POMDPs

Journal of Artificial Intelligence Research
Reinforcement learning: a survey

Journal of Artificial Intelligence Research
Computing factored value functions for policies in structured MDPs

IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
Max-norm projections for factored MDPs

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
Solving factored MDPs via non-homogeneous partitioning

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
RL-Based Memory Controller for Scalable Autonomous Systems

ICONIP '09 Proceedings of the 16th International Conference on Neural Information Processing: Part II
Architecture of behavior-based and robotics self-optimizing memory controller

ICRA'09 Proceedings of the 2009 IEEE international conference on Robotics and Automation
Constructing action set from basis functions for reinforcement learning of robot control

ICRA'09 Proceedings of the 2009 IEEE international conference on Robotics and Automation
On the evolution of artificial Tetris players

CIG'09 Proceedings of the 5th international conference on Computational Intelligence and Games
Feature Article---Merging AI and OR to Solve High-Dimensional Stochastic Optimization Problems Using Approximate Dynamic Programming

INFORMS Journal on Computing
Approximate dynamic programming techniques for the control of time-varying queuing systems applied to call centers with abandonments and retrials

Probability in the Engineering and Informational Sciences
Q-learning with linear function approximation

COLT'07 Proceedings of the 20th annual conference on Learning theory
Computing and using lower and upper bounds for action elimination in MDP planning

SARA'07 Proceedings of the 7th International conference on Abstraction, reformulation, and approximation
Approximate dynamic programming with a fuzzy parameterization

Automatica (Journal of IFAC)
Coordinated learning in multiagent MDPs with infinite state-space

Autonomous Agents and Multi-Agent Systems
Model minimization in Markov decision processes

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
On the characteristics of sequential decision problems and their impact on evolutionary computation and reinforcement learning

EA'09 Proceedings of the 9th international conference on Artificial evolution
Social conformity and its convergence for reinforcement learning

MATES'10 Proceedings of the 8th German conference on Multiagent system technologies
Continuous-state reinforcement learning with fuzzy approximation

ALAMAS'05/ALAMAS'06/ALAMAS'07 Proceedings of the 5th, 6th and 7th European conference on Adaptive and learning agents and multi-agent systems: adaptation and multi-agent learning
Cost-based query answering in action probabilistic logic programs

SUM'10 Proceedings of the 4th international conference on Scalable uncertainty management
Correlated action effects in decision theoretic regression

UAI'97 Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

COLT'06 Proceedings of the 19th annual conference on Learning Theory
Kernel-Based reinforcement learning

ICIC'06 Proceedings of the 2006 international conference on Intelligent Computing - Volume Part I
Minimax search and reinforcement learning for adversarial tetris

SETN'10 Proceedings of the 6th Hellenic conference on Artificial Intelligence: theories, models and applications
Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming

Mathematics of Operations Research
Event-learning and robust policy heuristics

Cognitive Systems Research
Reducing the learning time of tetris in evolution strategies

EA'11 Proceedings of the 10th international conference on Artificial Evolution
DCOB: Action space for reinforcement learning of high DoF robots

Autonomous Robots
Parallel Abductive Query Answering in Probabilistic Logic Programs

ACM Transactions on Computational Logic (TOCL)
Policy oscillation is overshooting

Neural Networks

Quantified Score

Hi-index	0.01
