The problem of state abstraction is of central importance in optimal control, reinforcement learning, and Markov decision processes. This paper studies variable resolution state abstraction for continuous-time, continuous-space, deterministic control problems in which near-optimal policies are required. We begin by defining a class of variable resolution policy and value function representations based on Kuhn triangulations embedded in a kd-trie. We then consider top-down approaches to choosing which cells to split in order to generate improved policies.

The core of this paper is the introduction and evaluation of a wide variety of possible splitting criteria. We begin with local approaches based on value function and policy properties that use only features of individual cells in making split choices. Later, by introducing two new non-local measures, influence and variance, we derive splitting criteria that allow one cell to efficiently take into account its impact on other cells when deciding whether to split. Influence is an efficiently calculable measure of the extent to which changes in some state affect the value function of other states. Variance is an efficiently calculable measure of how risky some state is in a Markov chain: a low-variance state is one in which we would be very surprised if, during any one execution, the long-term reward attained from that state differed substantially from its expected value, given by the value function.

The paper proceeds by graphically demonstrating the various approaches to splitting on the familiar, non-linear, non-minimum-phase, two-dimensional “Car on the hill” problem. It then evaluates the performance of a variety of splitting criteria on many benchmark problems, paying careful attention to their number-of-cells versus closeness-to-optimality tradeoff curves.
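The variance measure described above can be illustrated for a plain finite Markov chain. The sketch below is a generic illustration, not code from the paper: the toy transition matrix `P`, reward vector `r`, and discount `gamma` are assumptions chosen only to make the recursion concrete. It solves for the value function and then for the variance of the discounted return via the standard second-moment (Sobel-style) recursion.

```python
import numpy as np

# Hypothetical 3-state Markov chain (toy data, not from the paper).
# State 2 is absorbing with zero reward.
P = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]])
r = np.array([0.0, 1.0, 0.0])
gamma = 0.95

# Value function: V = r + gamma * P @ V, i.e. solve (I - gamma*P) V = r.
n = len(r)
V = np.linalg.solve(np.eye(n) - gamma * P, r)

# Variance of the discounted return G(s) = r(s) + gamma * G(S'):
#   sigma2(s) = gamma^2 * ( E_{s'}[sigma2(s')] + Var_{s'}[V(s')] )
# which is linear in sigma2, so solve (I - gamma^2 * P) sigma2 = gamma^2 * varV.
EV = P @ V                      # E_{s'}[V(s')] for each state s
varV = P @ (V ** 2) - EV ** 2   # Var_{s'}[V(s')] for each state s
sigma2 = np.linalg.solve(np.eye(n) - gamma**2 * P, gamma**2 * varV)
```

A low-variance state in this sense is one whose realized long-term reward rarely strays far from `V[s]`; the absorbing state here has zero variance, since its return is deterministic.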