We consider the problem of optimizing over time hundreds or thousands of discrete entities that may be characterized by relatively complex attributes, in the presence of different forms of uncertainty. Such problems arise in a range of operational settings such as transportation and logistics, where the entities may be aircraft, locomotives, containers, or people. These problems can be formulated using dynamic programming but encounter the widely cited “curse of dimensionality.” Even deterministic formulations of these problems can produce math programs with millions of rows, far beyond anything being solved today. This paper shows how we can combine concepts from artificial intelligence and operations research to produce practical solution methods that scale to industrial-strength problems. Throughout, we emphasize concepts, techniques, and notation from artificial intelligence and operations research to show how these fields can be brought together for complex stochastic, dynamic problems.
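The dynamic-programming formulation mentioned above can be sketched with the standard Bellman recursion (a generic sketch in conventional notation, not necessarily the paper's own): for state \(S_t\) and decision \(x_t\),

\[
V_t(S_t) \;=\; \max_{x_t \in \mathcal{X}_t}\Big( C_t(S_t, x_t) \;+\; \mathbb{E}\big[\, V_{t+1}(S_{t+1}) \,\big|\, S_t, x_t \big] \Big),
\]

where \(C_t\) is the one-period contribution and \(S_{t+1}\) evolves stochastically from \((S_t, x_t)\). The curse of dimensionality arises because \(S_t\) must describe the attributes of hundreds or thousands of entities, so the state, action, and outcome spaces all grow combinatorially, making exact evaluation of the recursion intractable.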