Dynamic programming: deterministic and stochastic models

Authors:
Dimitri P. Bertsekas
Affiliations:
-
Venue:
Dynamic programming: deterministic and stochastic models
Year:
1987

Citing 0
Cited 123

Computing Optimal Checkpointing Strategies for Rollback and Recovery Systems

IEEE Transactions on Computers - Fault-Tolerant Computing
Dyna, an integrated architecture for learning, planning, and reacting

ACM SIGART Bulletin
A framework for integrating perception, action, and trial-and-error learning

ACM SIGART Bulletin
Closed-loop control with delayed information

SIGMETRICS '92/PERFORMANCE '92 Proceedings of the 1992 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Markov decision processes in large state spaces

COLT '95 Proceedings of the eighth annual conference on Computational learning theory
Optimal scheduling of handoffs in cellular networks

IEEE/ACM Transactions on Networking (TON)
Bounding errors introduced by clustering of customers in closed product-form queuing networks

Journal of the ACM (JACM)
Importance sampling for Markov chains: computing variance and determining optimal measures

WSC '96 Proceedings of the 28th conference on Winter simulation
On optimal call admission control in cellular networks

Wireless Networks
Processor Assignment and Execution Sequence for Multiversion Software

IEEE Transactions on Computers
Surfing as a real option

Proceedings of the first international conference on Information and computation economies
Broadcast scheduling for information distribution

Wireless Networks
Colearning in Differential Games

Machine Learning
Reinforcement learning and mistake bounded algorithms

COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Efficient atomic broadcast using deterministic merge

Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing
A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions

Machine Learning
Controlling the robots of Web search engines

Proceedings of the 2001 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Reinforcement learning for fuzzy agents: application to a pighouse environment control

New learning paradigms in soft computing
Embedding a Priori Knowledge in Reinforcement Learning

Journal of Intelligent and Robotic Systems
Dynamic scheduling to minimize lost sales subject to set-up costs

Queueing Systems: Theory and Applications
Worst Traffic Passing Virtual Frame Regulation: Analysis with Dynamic Programming

Queueing Systems: Theory and Applications
Near-Optimal Reinforcement Learning in Polynomial Time

Machine Learning
Variable Resolution Discretization in Optimal Control

Machine Learning
Recent Advances in Hierarchical Reinforcement Learning

Discrete Event Dynamic Systems
Optimal Remapping in Dynamic Bulk Synchronous Computations via a Stochastic Control Approach

IEEE Transactions on Parallel and Distributed Systems
Problem Decomposition for Behavioural Cloning

ECML '00 Proceedings of the 11th European Conference on Machine Learning
Optimal Remapping in Dynamic Bulk Synchronous Computations via a Stochastic Control Approach

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Optimization and Simulation: Sequential Packing of Flexible Objects Using Evolutionary Algorithms

SAGA '01 Proceedings of the International Symposium on Stochastic Algorithms: Foundations and Applications
Model and Optimal Call Admission Policy in Cellular Mobile Networks

NETWORKING '00 Proceedings of the IFIP-TC6 / European Commission International Conference on Broadband Communications, High Performance Networking, and Performance of Communication Networks
A POMDP formulation of preference elicitation problems

Eighteenth national conference on Artificial intelligence
A theoretical study on an accurate reconstruction of multiview images based on the Viterbi algorithm

ICIP '95 Proceedings of the 1995 International Conference on Image Processing - Volume 2
Risk-averse auction agents

AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
On the undecidability of probabilistic planning and related stochastic optimization problems

Artificial Intelligence - special issue on planning with uncertainty and incomplete information
Efficient rate-controlled bulk data transfer using multiple multicast groups

IEEE/ACM Transactions on Networking (TON)
Information Flows in Capacitated Supply Chains with Fixed Ordering Costs

Management Science
A New Decision Rule for Lateral Transshipments in Inventory Systems

Management Science
Routing Of Airplanes To Two Runways: Monotonicity Of Optimal Controls

Probability in the Engineering and Informational Sciences
A Call Admission Control for Service Differentiation and Fairness Management in WDM Grooming Networks

BROADNETS '04 Proceedings of the First International Conference on Broadband Networks
An empirical study of policy convergence in Markov decision process value iteration

Computers and Operations Research
Computational intelligence for structured learning of a partner robot based on imitation

Information Sciences—Informatics and Computer Science: An International Journal - Special issue: Intelligent embedded agents
Media and data traffic coexistence in power-controlled wireless networks

WMuNeP '05 Proceedings of the 1st ACM workshop on Wireless multimedia networking and performance modeling
Online Evolution for a Self-Adapting Robotic Navigation System Using Evolvable Hardware

Artificial Life
Correctness of Local Probability Propagation in Graphical Models with Loops

Neural Computation
A Powerful Approach for Effective Finding of Significantly Differentially Expressed Genes

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Efficient approximate planning in continuous space Markovian Decision Problems

AI Communications
Reliable Due-Date Setting in a Capacitated MTO System with Two Customer Classes

Operations Research
Pricing Path-Dependent Securities by the Extended Tree Method

Management Science
Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues

Mathematics of Operations Research
A layered approach to learning coordination knowledge in multiagent environments

Applied Intelligence
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning

Artificial Intelligence
Accelerating autonomous learning by using heuristic selection of actions

Journal of Heuristics
On the convergence of stochastic iterative dynamic programming algorithms

Neural Computation
Resilient dynamic power management under uncertainty

Proceedings of the conference on Design, automation and test in Europe
Adaptive parameter control of evolutionary algorithms to improve quality-time trade-off

Applied Soft Computing
Improving the Exploration Strategy in Bandit Algorithms

Learning and Intelligent Optimization
Auctions for Resource Allocation in Overlay Networks

Network Control and Optimization
Indefinite-horizon POMDPs with action-based termination

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
A convergent reinforcement learning algorithm in the continuous case based on a finite difference method

IJCAI'97 Proceedings of the Fifteenth international joint conference on Artificial intelligence - Volume 2
Skill reconstruction as induction of LQ controllers with subgoals

IJCAI'97 Proceedings of the Fifteenth international joint conference on Artificial intelligence - Volume 2
Accelerating reinforcement learning through implicit imitation

Journal of Artificial Intelligence Research
Reinforcement learning: a survey

Journal of Artificial Intelligence Research
A model approximation scheme for planning in partially observable stochastic domains

Journal of Artificial Intelligence Research
Dealing with geometric constraints in game-theoretic planning

IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
Bounding the suboptimality of reusing subproblems

IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
Laplace distribution based Lagrangian rate distortion optimization for hybrid video coding

IEEE Transactions on Circuits and Systems for Video Technology
Utility-based on-line exploration for repeated navigation in an embedded graph

Artificial Intelligence
Learning to act using real-time dynamic programming

Artificial Intelligence
Delay-optimal power and precoder adaptation for multi-stream MIMO systems

IEEE Transactions on Wireless Communications
Low complexity precoder design for delay sensitive multi-stream MIMO systems

WCNC'09 Proceedings of the 2009 IEEE Wireless Communications & Networking Conference
Bayesian quickest change process detection

ISIT'09 Proceedings of the 2009 IEEE International Symposium on Information Theory - Volume 1
A retrospective on adaptive dynamic programming for control

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Computational intelligence for structured learning of a partner robot based on imitation

Information Sciences: an International Journal
Inference in credal networks: branch-and-bound methods and the A/R+ algorithm

International Journal of Approximate Reasoning
Restless watchdog: selective quickest spectrum sensing in multichannel cognitive radio systems

EURASIP Journal on Advances in Signal Processing - Special issue on dynamic spectrum access for wireless networking
Fuzzy decision tree function approximation in reinforcement learning

International Journal of Artificial Intelligence and Soft Computing
On the Topology of Discrete Strategies

International Journal of Robotics Research
Delay-optimal resource allocation for OFDMA systems via stochastic approximation

GLOBECOM'09 Proceedings of the 28th IEEE conference on Global telecommunications
Restless watchdog: monitoring multiple bands with blind period in cognitive radio systems

ICC'09 Proceedings of the 2009 IEEE international conference on Communications
Quickest change detection of a Markov process across a sensor array

IEEE Transactions on Information Theory
Stochastic optimal control for small noise intensities: the discrete-time case

WSEAS Transactions on Mathematics
Monitoring the progress of anytime problem-solving

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 2
Distributive stochastic learning for delay-optimal OFDMA power and subband allocation

IEEE Transactions on Signal Processing
Reinforcement learning with time

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
MDP-based lightpath establishment for service differentiation in all-optical WDM networks with wavelength conversion capability

Photonic Network Communications
Adaptive ε-greedy exploration in reinforcement learning based on value differences

KI'10 Proceedings of the 33rd annual German conference on Advances in artificial intelligence
A minimum relative entropy principle for learning and acting

Journal of Artificial Intelligence Research
On optimal call admission control in cellular networks

INFOCOM'96 Proceedings of the Fifteenth annual joint conference of the IEEE Computer and Communications Societies - Volume 1
Quantifying the Degree of Self-Nestedness of Trees: Application to the Structural Analysis of Plants

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
A Markovian process modeling for Pickomino

CG'10 Proceedings of the 7th international conference on Computers and games
On optimal scheduling algorithms for small generalized switches

IEEE/ACM Transactions on Networking (TON)
Technical Note: The Adaptive Knapsack Problem with Stochastic Rewards

Operations Research
Optimizing revenue for bandwidth auctions over networks with time reservations

Computer Networks: The International Journal of Computer and Telecommunications Networking
A dynamic programming strategy to balance exploration and exploitation in the bandit problem

Annals of Mathematics and Artificial Intelligence
A dynamic programming approach: Improving the performance of wireless networks

Journal of Parallel and Distributed Computing
A geometric approach to find nondominated policies to imprecise reward MDPs

ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I
Policy invariance under reward transformations for general-sum stochastic games

Journal of Artificial Intelligence Research
The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate

Mathematics of Operations Research
Feasibility study of utility-directed behaviour for computer game agents

Proceedings of the 8th International Conference on Advances in Computer Entertainment Technology
On the complexity of policy iteration

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
A possibilistic model for qualitative sequential decision problems under uncertainty in partially observable environments

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
On the complexity of solving Markov decision problems

UAI'95 Proceedings of the Eleventh conference on Uncertainty in artificial intelligence
Fast value iteration for goal-directed Markov decision processes

UAI'97 Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence
Constraining influence diagram structure by generative planning: an application to the optimization of oil spill response

UAI'96 Proceedings of the Twelfth international conference on Uncertainty in artificial intelligence
Inference in polytrees with sets of probabilities

UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence
An adaptive framework for solving multiple hard problems under time constraints

CIS'05 Proceedings of the 2005 international conference on Computational Intelligence and Security - Volume Part I
ARKAQ-learning: autonomous state space segmentation and policy generation

ISCIS'05 Proceedings of the 20th international conference on Computer and Information Sciences
Kernel-based reinforcement learning

ICIC'06 Proceedings of the 2006 international conference on Intelligent Computing - Volume Part I
A framework for reactive motion and sensing planning: a critical events-based approach

MICAI'05 Proceedings of the 4th Mexican international conference on Advances in Artificial Intelligence
Building Intelligent Interactive Tutors: Student-centered strategies for revolutionizing e-learning

Building Intelligent Interactive Tutors: Student-centered strategies for revolutionizing e-learning
Simulation-based graph similarity

TACAS'06 Proceedings of the 12th international conference on Tools and Algorithms for the Construction and Analysis of Systems
Sensing and Filtering: A Fresh Perspective Based on Preimages and Information Spaces

Foundations and Trends in Robotics
Leveraging domain knowledge to learn normative behavior: a bayesian approach

ALA'11 Proceedings of the 11th international conference on Adaptive and Learning Agents
Optimal inventory policies with non-stationary supply disruptions and advance supply information

Decision Support Systems
Optimal Service Control of a Serial Production Line with Unreliable Workstations and Random Demand

Automatica (Journal of IFAC)
Optimal control of renewable resources with alternative use

Mathematical and Computer Modelling: An International Journal
Text segmentation by product partition models and dynamic programming

Mathematical and Computer Modelling: An International Journal
Optimal control of polling models for transportation applications

Mathematical and Computer Modelling: An International Journal
Distributed adaptive bit-loading for spectrum optimization in multi-user multicarrier systems

Physical Communication
A call admission control for service differentiation and fairness management in WDM grooming networks

Optical Switching and Networking
Solving H-horizon, stationary Markov decision problems in time proportional to log(H)

Operations Research Letters
Modular value iteration through regional decomposition

AGI'12 Proceedings of the 5th international conference on Artificial General Intelligence
Observability-based local path planning and obstacle avoidance using bearing-only measurements

Robotics and Autonomous Systems

Quantified Score

Hi-index	0.07
