Adaptive value function approximation for continuous-state stochastic dynamic programming

Authors:
Huiyuan Fan;Prashant K. Tarun;Victoria C. P. Chen
Affiliations:
Rolls-Royce Energy Systems Inc, Mount Vernon, OH 43050, USA;Steven L. Craig School of Business, Missouri Western State University, St. Joseph, MO 64507, USA;Industrial and Manufacturing Systems Engineering, University of Texas at Arlington, Arlington, TX 76019, USA
Venue:
Computers and Operations Research
Year:
2013

Abstract

Approximate dynamic programming (ADP) commonly employs value function approximation to numerically solve complex dynamic programming problems. A statistical perspective on value function approximation employs a design and analysis of computer experiments (DACE) approach, where the "computer experiment" yields points on the value function curve. The DACE approach has been used to numerically solve high-dimensional, continuous-state stochastic dynamic programming problems, and it primarily performs two tasks: (1) design of experiments and (2) statistical modeling. The use of design of experiments enables more efficient discretization. However, identifying the appropriate sample size is not straightforward. Furthermore, identifying the appropriate model structure is a well-known problem in the field of statistics. In this paper, we present a sequential method that can adaptively determine both sample size and model structure. Number-theoretic methods (NTM) are used to sequentially grow the experimental design because of their ability to fill the design space. Feed-forward neural networks (NNs) are used for statistical modeling because their structural complexity is adjustable. This adaptive value function approximation (AVFA) method must be automated to enable efficient implementation within ADP. We introduce an AVFA algorithm that increments the size of the state-space training data set at each sequential step and, for each sample size, performs a successive model search to find an optimal NN model. The new algorithm is tested on a nine-dimensional inventory forecasting problem.
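
The sequential idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a Halton sequence as the number-theoretic design (the abstract does not say which NTM construction is used), and it leaves the model search as a caller-supplied function, since the neural network search procedure is not specified here. The names `radical_inverse`, `halton_points`, `adaptive_fit`, `evaluate`, `search_model`, `step`, `max_points`, and `tol`, as well as the error-tolerance stopping rule, are all hypothetical.

```python
from typing import Any, Callable, List, Sequence, Tuple

Point = Tuple[float, ...]

# One prime base per state dimension; nine bases cover a nine-dimensional state.
PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23)


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse: mirror the base-b digits of index about the radix point."""
    value, scale = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * scale
        scale /= base
    return value


def halton_points(start: int, count: int, dim: int) -> List[Point]:
    """Return Halton points number start+1 .. start+count in [0, 1)^dim.

    Because the sequence is extensible, a larger design keeps every
    earlier point, so value function evaluations are never discarded.
    """
    if dim > len(PRIMES):
        raise ValueError("add more prime bases to PRIMES for this dimension")
    return [
        tuple(radical_inverse(i, PRIMES[d]) for d in range(dim))
        for i in range(start + 1, start + count + 1)
    ]


def adaptive_fit(
    evaluate: Callable[[Point], float],
    search_model: Callable[[Sequence[Point], Sequence[float]], Tuple[Any, float]],
    dim: int,
    step: int,
    max_points: int,
    tol: float,
) -> Tuple[Any, int, float]:
    """Grow the design by `step` points at a time until the model is good enough.

    evaluate     -- the "computer experiment": value function at one state (assumed)
    search_model -- placeholder for the model search; returns (model, error estimate)
    tol          -- hypothetical stopping tolerance on the error estimate
    """
    design: List[Point] = []
    values: List[float] = []
    model, error = None, float("inf")
    while len(design) < max_points:
        new_points = halton_points(len(design), step, dim)
        design.extend(new_points)
        values.extend(evaluate(x) for x in new_points)  # only new points are evaluated
        model, error = search_model(design, values)     # refit on the enlarged sample
        if error <= tol:
            break
    return model, len(design), error
```

In an actual AVFA setting, `search_model` would train feed-forward networks of increasing complexity and return the selected network with a validation error. Any extensible low-discrepancy sequence could replace the Halton points, with states rescaled from the unit cube to the problem's state bounds.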