We consider a multistage asset acquisition problem in which assets are purchased now, at a price that varies randomly over time, to satisfy a random demand at a fixed point in the future. We provide a rare convergence proof for an approximate dynamic programming algorithm using pure exploitation, where the states visited depend on the decisions produced by solving the approximate problem. The resulting algorithm requires neither knowledge of the probability distributions of prices and demands nor any assumptions about their functional forms. The algorithm and its proof rely on the fact that the true value function is a family of piecewise linear concave functions.
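The piecewise linear concave structure mentioned above is typically exploited by storing the value function as a vector of slopes, smoothing a sampled gradient into one slope, and then projecting back onto the set of concave (non-increasing-slope) functions. The sketch below is a hypothetical illustration of that update, assuming a pool-adjacent-violators projection; the function name and update scheme are illustrative assumptions, not the paper's actual algorithm.

```python
def update_concave_slopes(slopes, s, sample_slope, stepsize):
    """One SPAR-style update of a piecewise linear concave value function.

    slopes       -- current slope estimates v[0] >= v[1] >= ... (concavity)
    s            -- index of the segment whose slope was sampled
    sample_slope -- observed sample gradient at segment s
    stepsize     -- smoothing stepsize in (0, 1]
    """
    v = list(slopes)
    # Smooth the sampled gradient into the current estimate.
    v[s] = (1 - stepsize) * v[s] + stepsize * sample_slope

    # Restore concavity: project onto non-increasing slope vectors via
    # pool-adjacent-violators (merge pools whose means violate the order).
    pools = []  # each pool is [sum_of_values, count]
    for x in v:
        pools.append([x, 1])
        # Merge while the previous pool's mean is below the current one's.
        while (len(pools) > 1
               and pools[-2][0] * pools[-1][1] < pools[-1][0] * pools[-2][1]):
            sm, ct = pools.pop()
            pools[-1][0] += sm
            pools[-1][1] += ct
    out = []
    for sm, ct in pools:
        out.extend([sm / ct] * ct)  # every slope in a pool takes the pool mean
    return out
```

For example, pushing a large sample slope into the middle of a decreasing slope vector forces the projection to average the violating neighbors, so the returned slopes remain non-increasing while preserving their total mass.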