Efficient sampling in approximate dynamic programming algorithms

Authors:
Cristiano Cervellera; Marco Muselli
Affiliations:
Istituto di Studi sui Sistemi Intelligenti per l'Automazione, Consiglio Nazionale delle Ricerche, Genova, Italy 16149; Istituto di Elettronica e di Ingegneria dell'Informazione e delle Telecomunicazioni, Consiglio Nazionale delle Ricerche, Genova, Italy 16149
Venue:
Computational Optimization and Applications
Year:
2007

Abstract

Dynamic Programming (DP) is a standard optimization tool for solving Stochastic Optimal Control (SOC) problems over either a finite or an infinite horizon of stages. Under very general assumptions, commonly employed numerical algorithms are based on approximations of the cost-to-go functions by means of suitable parametric models built from a set of sampling points in the d-dimensional state space. Here, the problem of sample complexity is discussed, i.e., how "fast" the number of points must grow with the input dimension in order to obtain an accurate estimate of the cost-to-go functions in typical DP approaches such as value iteration and policy iteration. It is shown that sampling based on low-discrepancy sequences, commonly used for efficient numerical integration, makes it possible to achieve, under suitable hypotheses, an almost linear sample complexity, thus helping to mitigate the curse of dimensionality of the approximate DP procedure.
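As an illustration of what sampling the state space with a low-discrepancy sequence can look like, the following is a minimal sketch that generates Halton points in the unit cube and rescales them to a box-shaped state space. The Halton sequence is chosen here only as one familiar low-discrepancy construction; the abstract does not say which sequences the paper uses. The names `radical_inverse`, `halton_points`, and `scale_to_box`, and the box bounds in the usage lines, are hypothetical and introduced purely for this example.

```python
# Illustrative sketch only: Halton points as sampling points in a
# d-dimensional state space. All names and bounds here are assumptions,
# not taken from the paper.

# First few primes, used as the per-coordinate bases of the Halton sequence.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]


def radical_inverse(index, base):
    """Van der Corput radical inverse: mirror the base-`base` digits
    of `index` around the radix point, giving a value in [0, 1)."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(n_points, dim):
    """Return the first `n_points` Halton points in [0, 1)^dim."""
    if dim > len(PRIMES):
        raise ValueError("this sketch supports at most %d dimensions" % len(PRIMES))
    return [
        [radical_inverse(i, PRIMES[j]) for j in range(dim)]
        for i in range(1, n_points + 1)  # start at 1 to skip the origin
    ]


def scale_to_box(points, lower, upper):
    """Map unit-cube points to the box [lower, upper] coordinate-wise."""
    return [
        [lo + u * (hi - lo) for u, lo, hi in zip(point, lower, upper)]
        for point in points
    ]


# Usage: 8 sampling points in a hypothetical 2-dimensional state space.
states = scale_to_box(halton_points(8, 2), lower=[0.0, -1.0], upper=[2.0, 1.0])
for state in states:
    print(state)
```

In an approximate DP scheme of the kind the abstract describes, points like `states` would be the locations at which the cost-to-go function is evaluated before fitting the parametric model. Replacing `halton_points` with uniformly random draws gives the i.i.d. sampling baseline for comparison.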