The curse of dimensionality gives rise to prohibitive computational requirements that render the exact solution of large-scale stochastic control problems infeasible. We study an efficient method based on linear programming for approximating solutions to such problems. The approach "fits" a linear combination of pre-selected basis functions to the dynamic programming cost-to-go function. We develop error bounds that offer performance guarantees and also guide the selection of both basis functions and "state-relevance weights" that influence the quality of the approximation. Experimental results in the domain of queueing network control provide empirical support for the methodology.
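The approximate linear program described above can be sketched concretely. In a discounted cost-minimizing MDP, one maximizes the state-relevance-weighted value c'Φr subject to the Bellman inequalities (Φr)(x) ≤ g(x,a) + α Σ_y P(y|x,a)(Φr)(y) for every state–action pair. The minimal sketch below uses a small hypothetical 3-state, 2-action MDP with made-up transition probabilities, costs, and basis functions (all numbers are illustrative, not from the paper), solved with `scipy.optimize.linprog`:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 3-state, 2-action discounted MDP (numbers for illustration only).
n_states, n_actions = 3, 2
alpha = 0.9  # discount factor

# P[a][x][y] = transition probability from state x to y under action a.
P = np.array([
    [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]],  # action 0
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],  # action 1
])
# g[x][a] = one-stage cost of taking action a in state x.
g = np.array([[2.0, 1.0], [1.0, 0.5], [0.5, 0.0]])

# Pre-selected basis functions (columns of Phi): a constant and the state index.
Phi = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])

# State-relevance weights c (uniform here).
c = np.ones(n_states) / n_states

# ALP: maximize c' Phi r  subject to
#   (Phi r)(x) <= g(x,a) + alpha * sum_y P(y|x,a) (Phi r)(y)  for all (x, a).
A_ub, b_ub = [], []
for a in range(n_actions):
    for x in range(n_states):
        A_ub.append(Phi[x] - alpha * P[a][x] @ Phi)
        b_ub.append(g[x, a])

# linprog minimizes, so negate the objective; weights r are unbounded in sign.
res = linprog(-(c @ Phi), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * Phi.shape[1])
J_approx = Phi @ res.x  # approximate cost-to-go at each state
```

Any feasible Φr is a pointwise lower bound on the true cost-to-go J*, so the LP pushes the approximation up toward J* in the directions weighted by c; this is why the choice of state-relevance weights shapes where the approximation is tight.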