Neuro-Dynamic Programming

Authors:
Dimitri P. Bertsekas;John N. Tsitsiklis
Affiliations:
-;-
Venue:
Neuro-Dynamic Programming
Year:
1996

Citing 0
Cited 569

Mean-field theory for batched TD(λ)

Neural Computation
Learning agents for uncertain environments (extended abstract)

COLT '98 Proceedings of the eleventh annual conference on Computational learning theory
Solving very large weakly coupled Markov decision processes

AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Incremental Gradient Algorithms with Stepsizes Bounded Away from Zero

Computational Optimization and Applications
Fast Online Q(λ)

Machine Learning
Elevator Group Control Using Multiple Reinforcement Learning Agents

Machine Learning
Learning Team Strategies: Soccer Case Studies

Machine Learning
Reinforcement learning and mistake bounded algorithms

COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Convergence analysis of temporal-difference learning algorithms with linear function approximation

COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Toward a Model of Intelligence as an Economy of Agents

Machine Learning
Congestion-dependent pricing of network services

IEEE/ACM Transactions on Networking (TON)
Learning to Play Chess Using Temporal Differences

Machine Learning
A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions

Machine Learning
Improved results for route planning in stochastic transportation

SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
On the Convergence of Temporal-Difference Learning with Linear Function Approximation

Machine Learning
Computational challenges in portfolio management

Computing in Science and Engineering
Simulation optimization

Proceedings of the 33rd conference on Winter simulation
On improving the performance of simulation-based algorithms for average reward processes with application to network pricing

Proceedings of the 33rd conference on Winter simulation
Itanium's new basic operation of fused multiply-add: theoretical explanation and theoretical challenge

ACM SIGACT News
Optimal plans for aggregation

Proceedings of the twenty-first annual symposium on Principles of distributed computing
Optimal resource allocation in multi-class networks with user-specified utility functions

Computer Networks: The International Journal of Computer and Telecommunications Networking
A Hybrid Architecture for Situated Learning of Reactive Sequential Decision Making

Applied Intelligence
Planning and Control in Artificial Intelligence: A Unifying Perspective

Applied Intelligence
A Model of Partially Observable State Game and its Optimality

Applied Intelligence
Reinforcement Learning Soccer Teams with Incomplete World Models

Autonomous Robots
The Relations Among Potentials, Perturbation Analysis, and Markov Decision Processes

Discrete Event Dynamic Systems
Dynamics of Transmission Provision in a Competitive Power Industry

Discrete Event Dynamic Systems
Rollout Algorithms for Combinatorial Optimization

Journal of Heuristics
Rollout Algorithms for Stochastic Scheduling Problems

Journal of Heuristics
On the value function of a priority queue with an application to a controlled polling model

Queueing Systems: Theory and Applications
Introduction

Machine Learning
Reinforcement Learning for Call Admission Control and Routing under Quality of Service Constraints in Multimedia Networks

Machine Learning
Near-Optimal Reinforcement Learning in Polynomial Time

Machine Learning
Technical Update: Least-Squares Temporal Difference Learning

Machine Learning
Risk-Sensitive Reinforcement Learning

Machine Learning
Variable Resolution Discretization in Optimal Control

Machine Learning
Linear waste of best fit bin packing on skewed distributions

Random Structures & Algorithms - Probabilistic methods in combinatorial optimization
Dopamine: generalization and bonuses

Neural Networks - Computational models of neuromodulation
Opponent interactions between serotonin and dopamine

Neural Networks - Computational models of neuromodulation
From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning

Discrete Event Dynamic Systems
Recent Advances in Hierarchical Reinforcement Learning

Discrete Event Dynamic Systems
Least Squares Policy Evaluation Algorithms with Linear Function Approximation

Discrete Event Dynamic Systems
A Topic-Specific Web Robot Model Based on Restless Bandits

IEEE Internet Computing
Sequence Learning: From Recognition and Prediction to Sequential Decision Making

IEEE Intelligent Systems
Optimal control using the transport equation: the Liouville machine

Adaptive Behavior
Learning of plan execution policies for indoor navigation

AI Communications - Special issue on KI-2001
Optimizing hypervideo navigation using a Markov decision process approach

Proceedings of the tenth ACM international conference on Multimedia
Learning to play strong poker

Machines that learn to play games
Towards a Universal Theory of Artificial Intelligence Based on Algorithmic Probability and Sequential Decisions

ECML '01 Proceedings of the 12th European Conference on Machine Learning
Learning While Exploring: Bridging the Gaps in the Eligibility Traces

ECML '01 Proceedings of the 12th European Conference on Machine Learning
Propagation of Q-values in Tabular TD(lambda)

ECML '02 Proceedings of the 13th European Conference on Machine Learning
Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning

ECML '02 Proceedings of the 13th European Conference on Machine Learning
Characterizing Markov Decision Processes

ECML '02 Proceedings of the 13th European Conference on Machine Learning
Co-evolving a Neural-Net Evaluation Function for Othello by Combining Genetic Algorithms and Reinforcement Learning

ICCS '01 Proceedings of the International Conference on Computational Science-Part II
A Framework for Supporting Intelligent Fault and Performance Management for Communication Networks

MMNS '01 Proceedings of the 4th IFIP/IEEE International Conference on Management of Multimedia Networks and Services: Management of Multimedia on the Internet
Minimizing Transmission Costs through Adaptive Marking in Differentiated Services Networks

MMNS '02 Proceedings of the 5th IFIP/IEEE International Conference on Management of Multimedia Networks and Services: Management of Multimedia on the Internet
An Overview of MAXQ Hierarchical Reinforcement Learning

SARA '02 Proceedings of the 4th International Symposium on Abstraction, Reformulation, and Approximation
Reinforcement Learning: Past, Present and Future

SEAL '98 Selected papers from the Second Asia-Pacific Conference on Simulated Evolution and Learning
Least-Squares Methods in Reinforcement Learning for Control

SETN '02 Proceedings of the Second Hellenic Conference on AI: Methods and Applications of Artificial Intelligence
Modelling Intelligent Behaviour: The Markov Decision Process Approach

IBERAMIA '98 Proceedings of the 6th Ibero-American Conference on AI: Progress in Artificial Intelligence
An Analysis of the Pheromone Q-Learning Algorithm

IBERAMIA 2002 Proceedings of the 8th Ibero-American Conference on AI: Advances in Artificial Intelligence
Hybrid Framework for Neuro-Dynamic Programming Application to Water Supply Networks

IWANN '01 Proceedings of the 6th International Work-Conference on Artificial and Natural Neural Networks: Bio-inspired Applications of Connectionism-Part II
Rationality Assumptions and Optimality of Co-learning

PRIMA '00 Proceedings of the Third Pacific Rim International Workshop on Multi-Agents: Design and Applications of Intelligent Agents
Karlsruhe Brainstormers - A Reinforcement Learning Approach to Robotic Soccer

RoboCup 2001: Robot Soccer World Cup V
Different Local Search Algorithms in STAGE for Solving Bin Packing Problem

EurAsia-ICT '02 Proceedings of the First EurAsian Conference on Information and Communication Technology
Learning to Behave by Environment Reinforcement

RoboCup-99: Robot Soccer World Cup III
Computing Minimum and Maximum Reachability Times in Probabilistic Systems

CONCUR '99 Proceedings of the 10th International Conference on Concurrency Theory
An Improved Q-Learning Algorithm Using Synthetic Pheromones

CEEMAS '01 Revised Papers from the Second International Workshop of Central and Eastern Europe on Multi-Agent Systems: From Theory to Practice in Multi-Agent Systems
Distributed Learning and Control for Manufacturing Systems Scheduling

Proceedings of the 14th International conference on Industrial and engineering applications of artificial intelligence and expert systems: engineering of intelligent systems
Controller Scheduling Using Neural Networks: Implementation and Experimental Results

Hybrid Systems V
On the Asymptotic Behaviour of a Constant Stepsize Temporal-Difference Learning Algorithm

EuroCOLT '99 Proceedings of the 4th European Conference on Computational Learning Theory
Open Theoretical Questions in Reinforcement Learning

EuroCOLT '99 Proceedings of the 4th European Conference on Computational Learning Theory
Introduction to Sequence Learning

Sequence Learning - Paradigms, Algorithms, and Applications
Automatic Segmentation of Sequences through Hierarchical Reinforcement Learning

Sequence Learning - Paradigms, Algorithms, and Applications
Sequential Decision Making Based on Direct Search

Sequence Learning - Paradigms, Algorithms, and Applications
Towards Stochastic Constraint Programming: A Study of Online Multi-choice Knapsack with Deadlines

CP '01 Proceedings of the 7th International Conference on Principles and Practice of Constraint Programming
Restart Policies with Dependence among Runs: A Dynamic Programming Approach

CP '02 Proceedings of the 8th International Conference on Principles and Practice of Constraint Programming
Logic, Knowledge Representation, and Bayesian Decision Theory

CL '00 Proceedings of the First International Conference on Computational Logic
Dynamic Pricing of Information Products Based on Reinforcement Learning: A Yield-Management Approach

KI '02 Proceedings of the 25th Annual German Conference on AI: Advances in Artificial Intelligence
Feedforward Neural Networks in Reinforcement Learning Applied to High-Dimensional Motor Control

ALT '02 Proceedings of the 13th International Conference on Algorithmic Learning Theory
To Collect or Not to Collect? Machine Learning for Memory Management

Proceedings of the 2nd Java Virtual Machine Research and Technology Symposium
Learning Rates for Q-Learning

COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
PAC Bounds for Multi-armed Bandit and Markov Decision Processes

COLT '02 Proceedings of the 15th Annual Conference on Computational Learning Theory
Using Rollouts to Induce a Policy from a User Model

UM '01 Proceedings of the 8th International Conference on User Modeling 2001
Adaptive Strategies and Regret Minimization in Arbitrarily Varying Markov Environments

COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Memetic-neural scheduler of jobs in identical parallel machines

Second international workshop on Intelligent systems design and application
Sequential cost-sensitive decision making with reinforcement learning

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Polynomial-time reinforcement learning of near-optimal policies

Eighteenth national conference on Artificial intelligence
Greedy linear value-approximation for factored Markov decision processes

Eighteenth national conference on Artificial intelligence
Piecewise linear value function approximation for factored MDPs

Eighteenth national conference on Artificial intelligence
Multi-agent learning in extensive games with complete information

AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
An introduction to reinforcement learning theory: value function methods

Advanced lectures on machine learning
Solving factored MDPs using non-homogeneous partitions

Artificial Intelligence - special issue on planning with uncertainty and incomplete information
The empirical Bayes envelope and regret minimization in competitive Markov decision processes

Mathematics of Operations Research
Analysis of a Rollout Approach to Sequencing Problems with Stochastic Routing Applications

Journal of Heuristics
On the convergence of optimistic policy iteration

The Journal of Machine Learning Research
Lyapunov design for safe reinforcement learning

The Journal of Machine Learning Research
Adaptive Radial Basis Decomposition by Learning Vector Quantization

Neural Processing Letters
On finding global optima for the hinge fitting problem

Computers and Operations Research
Reinforcing reachable routes

Computer Networks: The International Journal of Computer and Telecommunications Networking
Least-squares policy iteration

The Journal of Machine Learning Research
Distributed Reinforcement Learning Control for Batch Sequencing and Sizing in Just-In-Time Manufacturing Systems

Applied Intelligence
A Reinforcement Learning Algorithm Based on Policy Iteration for Average Reward: Empirical Results with Yield Management and Convergence Analysis

Machine Learning
Combining importance sampling and temporal difference control variates to simulate Markov Chains

ACM Transactions on Modeling and Computer Simulation (TOMACS)
CONVERGENCE OF SIMULATION-BASED POLICY ITERATION

Probability in the Engineering and Informational Sciences
A LEARNING ALGORITHM FOR DISCRETE-TIME STOCHASTIC CONTROL

Probability in the Engineering and Informational Sciences
Parallel Rollout for Online Solution of Partially Observable Markov Decision Processes

Discrete Event Dynamic Systems
Online pricing for bandwidth provisioning in multi-class networks

Computer Networks: The International Journal of Computer and Telecommunications Networking
Learning Rates for Q-learning

The Journal of Machine Learning Research
A Geometric Approach to Multi-Criterion Reinforcement Learning

The Journal of Machine Learning Research
A generic architecture for adaptive agents based on reinforcement learning

Information Sciences—Informatics and Computer Science: An International Journal - Special issue: Bio-inspired systems (BIS)
Dynamic bipedal walking assisted by learning

Robotica
An Explicit Solution for the Value Function of a Priority Queue

Queueing Systems: Theory and Applications
Dynamic abstraction in reinforcement learning via clustering

ICML '04 Proceedings of the twenty-first international conference on Machine learning
Convergence of synchronous reinforcement learning with linear function approximation

ICML '04 Proceedings of the twenty-first international conference on Machine learning
Modeling correlations in web traces and implications for designing replacement policies

Computer Networks: The International Journal of Computer and Telecommunications Networking
Reinforcement Learning with Factored States and Actions

The Journal of Machine Learning Research
Integrating Guidance into Relational Reinforcement Learning

Machine Learning
A Dynamic Pricing Mechanism for P2P Referral Systems

AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 3
Planning and programming with first-order markov decision processes: insights and challenges

TARK '01 Proceedings of the 8th conference on Theoretical aspects of rationality and knowledge
New simulation methodology for finance: duality theory and simulation in financial engineering

Proceedings of the 35th conference on Winter simulation: driving innovation
Routing Of Airplanes To Two Runways: Monotonicity Of Optimal Controls

Probability in the Engineering and Informational Sciences
Efficient QoS Provisioning for Adaptive Multimedia in Mobile Communication Networks by Reinforcement Learning

BROADNETS '04 Proceedings of the First International Conference on Broadband Networks
A Stochastic Control Model for Deployment of Dynamic Grid Services

GRID '04 Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing
Solving factored MDPs with continuous and discrete variables

UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
Coordinating Multiple Agents via Reinforcement Learning

Autonomous Agents and Multi-Agent Systems
Review of "Genetic Algorithms for Machine Learning by John J. Grefenstette", Kluwer Academic Publishers, 1993.

ACM SIGART Bulletin
System for foreign exchange trading using genetic algorithms and reinforcement learning

International Journal of Systems Science
Online convex optimization in the bandit setting: gradient descent without a gradient

SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
Parallelization Strategies for Rollout Algorithms

Computational Optimization and Applications
Machine learning

Encyclopedia of Computer Science
Computational intelligence for structured learning of a partner robot based on imitation

Information Sciences—Informatics and Computer Science: An International Journal - Special issue: Intelligent embedded agents
Dynamical Mobile Terminal Location Registration in Wireless PCS Networks

IEEE Transactions on Mobile Computing
Optimal Control Using the Transport Equation: The Liouville Machine

Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Exploration and apprenticeship learning in reinforcement learning

ICML '05 Proceedings of the 22nd international conference on Machine learning
Reinforcement learning with Gaussian processes

ICML '05 Proceedings of the 22nd international conference on Machine learning
Interactive learning of mappings from visual percepts to actions

ICML '05 Proceedings of the 22nd international conference on Machine learning
Relating reinforcement learning performance to classification performance

ICML '05 Proceedings of the 22nd international conference on Machine learning
Proto-value functions: developmental reinforcement learning

ICML '05 Proceedings of the 22nd international conference on Machine learning
Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees

ICML '05 Proceedings of the 22nd international conference on Machine learning
Finite time bounds for sampling based fitted value iteration

ICML '05 Proceedings of the 22nd international conference on Machine learning
Bayesian sparse sampling for on-line reward optimization

ICML '05 Proceedings of the 22nd international conference on Machine learning
A Reinforcement Learning Algorithm in Cooperative Multi-Robot Domains

Journal of Intelligent and Robotic Systems
An analytic modelling approach for network routing algorithms that use "ant-like" mobile agents

Computer Networks: The International Journal of Computer and Telecommunications Networking
Stochastic Optimal Control and Estimation Methods Adapted to the Noise Characteristics of the Sensorimotor System

Neural Computation
The concept of a universal learning system as a basis for creating a general mathematical theory of learning

Minds and Machines - Machine learning as experimental philosophy of science
A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms

Neural Computation
A Sensorimotor Map: Modulating Lateral Interactions for Anticipation and Planning

Neural Computation
A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning

Discrete Event Dynamic Systems
Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes

Machine Learning
Using inaccurate models in reinforcement learning

ICML '06 Proceedings of the 23rd international conference on Machine learning
Approximate Policy Optimization and Adaptive Control in Regression Models

Computational Economics
Efficient QoS provisioning for adaptive multimedia in mobile communication networks by reinforcement learning

Mobile Networks and Applications - Special issue: Recent advances in wireless networking
Function-approximation-based importance sampling for pricing American options

WSC '04 Proceedings of the 36th conference on Winter simulation
The optimizing-simulator: merging simulation and optimization using approximate dynamic programming

WSC '05 Proceedings of the 37th conference on Winter simulation
Approximate dynamic programming in multi-skill call centers

WSC '05 Proceedings of the 37th conference on Winter simulation
Function-approximation-based perfect control variates for pricing American options

WSC '05 Proceedings of the 37th conference on Winter simulation
Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming

Machine Learning
A Neuro-Dynamic Programming-Based Optimal Controller for Tomato Seedling Growth in Greenhouse Systems

Neural Processing Letters
Quantum robot: structure, algorithms and applications

Robotica
The Role of Problem Classification in Online Meta-cognition

IAT '06 Proceedings of the IEEE/WIC/ACM international conference on Intelligent Agent Technology
Heterarchical reinforcement-learning model for integration of multiple cortico-striatal loops: fMRI examination in stimulus-action-reward association learning

Neural Networks - 2006 Special issue: Neurobiology of decision making
Learning tetris using the noisy cross-entropy method

Neural Computation
Neural Network Feedback Control: Work at UTA's Automation and Robotics Research Institute

Journal of Intelligent and Robotic Systems
Dimensions of complexity of intelligent agents

PCAR '06 Proceedings of the 2006 international symposium on Practical cognitive agents and robots
Asymptotic Variance Of Passage Time Estimators In Markov Chains

Probability in the Engineering and Informational Sciences
A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees

Mathematics of Operations Research
Approximate Solutions of a Dynamic Forecast-Inventory Model

Manufacturing & Service Operations Management
Models of the Spiral-Down Effect in Revenue Management

Operations Research
Bias and Variance Approximation in Value Function Estimates

Management Science
A robust Markov game controller for nonlinear systems

Applied Soft Computing
On the Empirical State-Action Frequencies in Markov Decision Processes Under General Policies

Mathematics of Operations Research
Retailer-Supplier Flexible Commitments Contracts: A Robust Optimization Approach

Manufacturing & Service Operations Management
A Price-Directed Approach to Stochastic Inventory/Routing

Operations Research
Robust Control of Markov Decision Processes with Uncertain Transition Matrices

Operations Research
The Dynamic Assignment Problem

Transportation Science
Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems

The Journal of Machine Learning Research
Collaborative Multiagent Reinforcement Learning by Payoff Propagation

The Journal of Machine Learning Research
STEWARD: demo of spatio-textual extraction on the web aiding the retrieval of documents

dg.o '07 Proceedings of the 8th annual international conference on Digital government research: bridging disciplines & domains
Point-Based Value Iteration for Continuous POMDPs

The Journal of Machine Learning Research
Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule

Neural Computation
Bayesian actor-critic algorithms

Proceedings of the 24th international conference on Machine learning
Automatic shaping and decomposition of reward functions

Proceedings of the 24th international conference on Machine learning
Learning to trade with insider information

Proceedings of the ninth international conference on Electronic commerce
A framework for meta-level control in multi-agent systems

Autonomous Agents and Multi-Agent Systems
Shaping multi-agent systems with gradient reinforcement learning

Autonomous Agents and Multi-Agent Systems
Efficient PAC Learning for Episodic Tasks with Acyclic State Spaces

Discrete Event Dynamic Systems
Design of a peer-to-peer system for optimized content replication

Computer Communications
Optimal Sequential Exploration: A Binary Learning Model

Decision Analysis
Convergence Analysis of Batch Gradient Algorithm for Three Classes of Sigma-Pi Neural Networks

Neural Processing Letters
Application of reinforcement learning to the game of Othello

Computers and Operations Research
Analysis and optimization of service availability in a HA cluster with load-dependent machine availability

IEEE Transactions on Parallel and Distributed Systems
Combined rate and power allocation with link scheduling in wireless data packet relay networks with fading channels

EURASIP Journal on Wireless Communications and Networking
Efficient sampling in approximate dynamic programming algorithms

Computational Optimization and Applications
Universal Intelligence: A Definition of Machine Intelligence

Minds and Machines
A formal framework and extensions for function approximation in learning classifier systems

Machine Learning
Probabilistic incremental program evolution

Evolutionary Computation
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning

Artificial Intelligence
The optimizing-simulator: merging simulation and optimization using approximate dynamic programming

Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come
Natural Actor-Critic

Neurocomputing
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

Machine Learning
RL-MAC: a reinforcement learning based MAC protocol for wireless sensor networks

International Journal of Sensor Networks
Dynamic modeling and control of supply chain systems: A review

Computers and Operations Research
Biologically-inspired adaptive learning control strategies: A rough set approach

International Journal of Hybrid Intelligent Systems
Approximate dynamic programming for link scheduling in wireless mesh networks

Computers and Operations Research
Knowledge propagation in a distributed omnidirectional vision system

Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology - Marco Somalvico Memorial Issue
Error bounds of optimization algorithms for semi-Markov decision processes

International Journal of Systems Science
Self-Optimizing Memory Controllers: A Reinforcement Learning Approach

ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
A theoretical framework for quality-aware cross-layer optimized wireless multimedia communications

Advances in Multimedia
Reinforcement learning in the presence of rare events

Proceedings of the 25th international conference on Machine learning
Non-parametric policy gradients: a unified treatment of propositional and relational domains

Proceedings of the 25th international conference on Machine learning
A worst-case comparison between temporal difference and residual gradient with linear function approximation

Proceedings of the 25th international conference on Machine learning
An analysis of reinforcement learning with function approximation

Proceedings of the 25th international conference on Machine learning
A semiparametric statistical approach to model-free policy evaluation

Proceedings of the 25th international conference on Machine learning
Learning Control Knowledge for Forward Search Planning

The Journal of Machine Learning Research
Finite-Time Bounds for Fitted Value Iteration

The Journal of Machine Learning Research
Mapping land cover from detailed aerial photography data using textural and neural network analysis

International Journal of Remote Sensing
Tuning continual exploration in reinforcement learning: An optimality property of the Boltzmann strategy

Neurocomputing
Controlling deliberation in a Markov decision process-based agent

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Interaction-driven Markov games for decentralized multiagent planning under uncertainty

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Reinforcement Learning in Nonstationary Environment Navigation Tasks

CAI '07 Proceedings of the 20th conference of the Canadian Society for Computational Studies of Intelligence on Advances in Artificial Intelligence
Making a Robot Learn to Play Soccer Using Reward and Punishment

KI '07 Proceedings of the 30th annual German conference on Advances in Artificial Intelligence
Simulation-Based Optimization Approach for Software Cost Model with Rejuvenation

ATC '08 Proceedings of the 5th international conference on Autonomic and Trusted Computing
A Learning Automata Approach to Multi-agent Policy Gradient Learning

KES '08 Proceedings of the 12th international conference on Knowledge-Based Intelligent Information and Engineering Systems, Part II
Evolution Strategies for Direct Policy Search

Proceedings of the 10th international conference on Parallel Problem Solving from Nature: PPSN X
Fitted Natural Actor-Critic: A New Algorithm for Continuous State-Action MDPs

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Dynamic and Neuro-Dynamic Optimization of a Fed-Batch Fermentation Process

AIMSA '08 Proceedings of the 13th international conference on Artificial Intelligence: Methodology, Systems, and Applications
Adaptive Dynamic Programming for a Class of Nonlinear Control Systems with General Separable Performance Index

ISNN '08 Proceedings of the 5th international symposium on Neural Networks: Advances in Neural Networks, Part II
Value Function Based Reinforcement Learning in Changing Markovian Environments

The Journal of Machine Learning Research
Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes

Simulation
Learning the Filling Policy of a Biodegradation Process by Fuzzy Actor-Critic Learning Methodology

MICAI '08 Proceedings of the 7th Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence
Tank Model Coupled with an Artificial Neural Network

MICAI '08 Proceedings of the 7th Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence
Optimistic Planning of Deterministic Systems

Recent Advances in Reinforcement Learning
Policy Iteration for Learning an Exercise Policy for American Options

Recent Advances in Reinforcement Learning
New Error Bounds for Approximations from Projected Linear Equations

Recent Advances in Reinforcement Learning
Markov Decision Processes with Arbitrary Reward Processes

Recent Advances in Reinforcement Learning
A New Learning Algorithm for Optimal Stopping

Discrete Event Dynamic Systems
Optimal parameter trajectory estimation in parameterized SDEs: An algorithmic procedure

ACM Transactions on Modeling and Computer Simulation (TOMACS)
Learning While Optimizing an Unknown Fitness Surface

Learning and Intelligent Optimization
A spiking neural network model of an actor-critic learning agent

Neural Computation
The factored policy-gradient planner

Artificial Intelligence
Practical solution techniques for first-order MDPs

Artificial Intelligence
Factored value iteration converges

Acta Cybernetica
Factored temporal difference learning in the New Ties environment

Acta Cybernetica
Some topics for simulation optimization

Proceedings of the 40th Conference on Winter Simulation
Approximate dynamic programming: lessons from the field

Proceedings of the 40th Conference on Winter Simulation
Projected equation methods for approximate solution of large linear systems

Journal of Computational and Applied Mathematics
QL2, a simple reinforcement learning scheme for two-player zero-sum Markov games

Neurocomputing
Gaussian process dynamic programming

Neurocomputing
Reinforcement distribution in fuzzy Q-learning

Fuzzy Sets and Systems
Dynamic routing policies for multiskill call centers

Probability in the Engineering and Informational Sciences
Opportunistic Transmission over Randomly Varying Channels

Network Control and Optimization
Book Reviews

Interfaces
An Optimal Approximate Dynamic Programming Algorithm for the Lagged Asset Acquisition Problem

Mathematics of Operations Research
Reoptimization Approaches for the Vehicle-Routing Problem with Stochastic Demands

Operations Research
Simultaneous Optimal Control and Discrete Stochastic Sensor Selection

HSCC '09 Proceedings of the 12th International Conference on Hybrid Systems: Computation and Control
Reinforcement Learning: A Tutorial Survey and Recent Advances

INFORMS Journal on Computing
The optimizing-simulator: An illustration using the military airlift problem

ACM Transactions on Modeling and Computer Simulation (TOMACS)
A reinforcement learning framework for utility-based scheduling in resource-constrained systems

Future Generation Computer Systems
An Approximate Dynamic Programming Algorithm for Large-Scale Fleet Management: A Case Application

Transportation Science
Constraint relaxation in approximate linear programs

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Model-free reinforcement learning as mixture learning

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Performance Evaluation of Direct Heuristic Dynamic Programming using Control-Theoretic Measures

Journal of Intelligent and Robotic Systems
A POMDP framework for coordinated guidance of autonomous UAVs for multitarget tracking

EURASIP Journal on Advances in Signal Processing - Special issue on signal processing advances in robots and autonomy
Online exploration in least-squares policy iteration

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Learning of coordination: exploiting sparse interactions in multiagent systems

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Scheduling policy design for autonomic systems

International Journal of Autonomous and Adaptive Communications Systems
Reordering Sparsification of Kernel Machines in Approximate Policy Iteration

ISNN 2009 Proceedings of the 6th International Symposium on Neural Networks: Advances in Neural Networks - Part II
Reinforcement Learning Control of a Real Mobile Robot Using Approximate Policy Iteration

ISNN 2009 Proceedings of the 6th International Symposium on Neural Networks: Advances in Neural Networks - Part III
Direct Policy Search Reinforcement Learning for Robot Control

Proceedings of the 2005 conference on Artificial Intelligence Research and Development
An Agent-Based Model for the Adaptation of Processing Efficiency for Prioritized Traffic

KES-AMSTA '09 Proceedings of the Third KES International Symposium on Agent and Multi-Agent Systems: Technologies and Applications
Reinforcement learning for robot soccer

Autonomous Robots
Partially Observable Markov Decision Process Approximations for Adaptive Sensing

Discrete Event Dynamic Systems
Evolutionary-based learning of generalised policies for AI planning domains

Proceedings of the 11th Annual conference on Genetic and evolutionary computation
Apply ant colony optimization to Tetris

Proceedings of the 11th Annual conference on Genetic and evolutionary computation
Multiscale Anticipatory Behavior by Hierarchical Reinforcement Learning

Anticipatory Behavior in Adaptive Learning Systems
Learning Representation and Control in Markov Decision Processes: New Frontiers

Foundations and Trends® in Machine Learning
Randomized shortest-path problems: Two related models

Neural Computation
An Approximate Dynamic Programming Approach to Network Revenue Management with Customer Choice

Transportation Science
Reinforcement learning for a CPG-driven biped robot

AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
Learning representation and control in continuous Markov decision processes

AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Focused real-time dynamic programming for MDPs: squeezing more out of a heuristic

AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Interactively shaping agents via human reinforcement: the TAMER framework

Proceedings of the fifth international conference on Knowledge capture
Online Markov Decision Processes

Mathematics of Operations Research
Markov Decision Processes with Arbitrary Reward Processes

Mathematics of Operations Research
Optimal Online Learning Procedures for Model-Free Policy Evaluation

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II
Samuel meets Amarel: automating value function approximation using global state space analysis

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Error bounds for approximate value iteration

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Lazy approximation for solving continuous finite-horizon MDPs

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 3
Compact spectral bases for value function approximation using Kronecker factorization

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Hierarchical reinforcement learning with the MAXQ value function decomposition

Journal of Artificial Intelligence Research
Optimizing dialogue management with reinforcement learning: experiments with the NJFun system

Journal of Artificial Intelligence Research
Efficient reinforcement learning using recursive least-squares methods

Journal of Artificial Intelligence Research
Potential-based shaping and Q-value initialization are equivalent

Journal of Artificial Intelligence Research
Efficient solution algorithms for factored MDPs

Journal of Artificial Intelligence Research
Accelerating reinforcement learning through implicit imitation

Journal of Artificial Intelligence Research
Risk-sensitive reinforcement learning applied to control under constraints

Journal of Artificial Intelligence Research
Perseus: randomized point-based value iteration for POMDPs

Journal of Artificial Intelligence Research
Approximate policy iteration with a policy language bias: solving relational Markov decision processes

Journal of Artificial Intelligence Research
Solving factored MDPs with hybrid state and action variables

Journal of Artificial Intelligence Research
Anytime point-based approximations for large POMDPs

Journal of Artificial Intelligence Research
Closed-loop learning of visual control policies

Journal of Artificial Intelligence Research
Learning to play using low-complexity rule-based policies: illustrations through Ms. Pac-Man

Journal of Artificial Intelligence Research
Adaptive stochastic resource control: a machine learning approach

Journal of Artificial Intelligence Research
Learning partially observable deterministic action models

Journal of Artificial Intelligence Research
A heuristic search approach to planning with continuous resources in stochastic domains

Journal of Artificial Intelligence Research
AntNet: distributed stigmergetic control for communications networks

Journal of Artificial Intelligence Research
Infinite-horizon policy-gradient estimation

Journal of Artificial Intelligence Research
Experiments with infinite-horizon, policy-gradient estimation

Journal of Artificial Intelligence Research
Sequential optimality and coordination in multiagent systems

IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 1
Convergence of reinforcement learning with general function approximators

IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
A decision-theoretic model of assistance

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Using learned policies in heuristic-search planning

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Hierarchical heuristic forward search in Stochastic domains

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
An analysis of Laplacian methods for value function approximation in MDPs

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Covariant policy search

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Approximate policy iteration using large-margin classifiers

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
A bilinear programming approach for multiagent planning

Journal of Artificial Intelligence Research
Solving factored MDPs via non-homogeneous partitioning

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
Symbolic dynamic programming for first-order MDPs

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
From Q(λ) to average Q-learning: efficient implementation of an asymptotic approximation

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Exploiting multiple secondary reinforcers in policy gradient reinforcement learning

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Robust planning with (L)RTDP

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Probabilistic reasoning for plan robustness

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Solving POMDPs with continuous or large discrete observation spaces

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
An MCMC approach to solving hybrid factored MDPs

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
On solving the Lagrangian dual of integer programs via an incremental approach

Computational Optimization and Applications
Intensional dynamic programming. A Rosetta stone for structured dynamic programming

Journal of Algorithms
Assured end-to-end QoS through adaptive marking in multi-domain differentiated services networks

Computer Communications
An analytic modelling approach for network routing algorithms that use "ant-like" mobile agents

Computer Networks: The International Journal of Computer and Telecommunications Networking
Theoretical analysis of batch and on-line training for gradient descent learning in neural networks

Neurocomputing
Reinforcement learning versus model predictive control: a comparison on a power system problem

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Reinforcement-learning-based output-feedback control of nonstrict nonlinear discrete-time systems with application to engine emission control

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Neural network output optimization using interval analysis

IEEE Transactions on Neural Networks
Boundedness and convergence of online gradient method with penalty for feedforward neural networks

IEEE Transactions on Neural Networks
Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints

IEEE Transactions on Neural Networks
A Q-learning approach to derive optimal consumption and investment strategies

IEEE Transactions on Neural Networks
A Computational Model of Social-Learning Mechanisms

Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Solving POMDPs: RTDP-bel vs. point-based algorithms

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
ReTrASE: integrating paradigms for approximate probabilistic planning

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Training parsers by inverse reinforcement learning

Machine Learning
Reinforcement learning and adaptive dynamic programming for feedback control

IEEE Circuits and Systems Magazine
Customized learning algorithms for episodic tasks with acyclic state spaces

CASE'09 Proceedings of the fifth annual IEEE international conference on Automation science and engineering
Stochastic model for outcome prediction in acute illness

Computers in Biology and Medicine
Online learning in Markov decision processes with arbitrarily changing rewards and transitions

GameNets'09 Proceedings of the First ICST international conference on Game Theory for Networks
An Additive Reinforcement Learning

ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part I
Finding Best k Policies

ADT '09 Proceedings of the 1st International Conference on Algorithmic Decision Theory
Reinforcement Learning Based Web Service Compositions for Mobile Business

WISM '09 Proceedings of the International Conference on Web Information Systems and Mining
A reinforcement learning framework for utility-based scheduling in resource-constrained systems

A reinforcement learning framework for utility-based scheduling in resource-constrained systems
A reinforcement learning approach to dynamic resource allocation

A reinforcement learning approach to dynamic resource allocation
Constrained controller design for a class of nonlinear discrete-time uncertain systems

ACC'09 Proceedings of the 2009 conference on American Control Conference
Approximate dynamic programming using Bellman residual elimination and Gaussian process regression

ACC'09 Proceedings of the 2009 conference on American Control Conference
Efficient suboptimal solutions of switched LQR problems

ACC'09 Proceedings of the 2009 conference on American Control Conference
icLQG: combining local and global optimization for control in information space

ICRA'09 Proceedings of the 2009 IEEE international conference on Robotics and Automation
Online actor critic algorithm to solve the continuous-time infinite horizon optimal control problem

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Adaptive dynamic programming for discrete-time systems with infinite horizon and ε-error bound in the performance cost

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Generalized policy iteration for continuous-time systems

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
An alpha derivative formulation of the Hamilton-Jacobi-Bellman equation of dynamic programming

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning

Neural Computation
Computational intelligence for structured learning of a partner robot based on imitation

Information Sciences: an International Journal
On the evolution of artificial Tetris players

CIG'09 Proceedings of the 5th international conference on Computational Intelligence and Games
R-POPTVR: a novel reinforcement-based POPTVR fuzzy neural network for pattern classification

IEEE Transactions on Neural Networks
Adaptive dynamic programming: an introduction

IEEE Computational Intelligence Magazine
Percentile Optimization for Markov Decision Processes with Parameter Uncertainty

Operations Research
Restless watchdog: selective quickest spectrum sensing in multichannel cognitive radio systems

EURASIP Journal on Advances in Signal Processing - Special issue on dynamic spectrum access for wireless networking
The improvement of Q-learning applied to imperfect information game

SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
Feature Article: Merging AI and OR to Solve High-Dimensional Stochastic Optimization Problems Using Approximate Dynamic Programming

INFORMS Journal on Computing
Rejoinder: The Languages of Stochastic Optimization

INFORMS Journal on Computing
Approximate dynamic programming techniques for the control of time-varying queuing systems applied to call centers with abandonments and retrials

Probability in the Engineering and Informational Sciences
A framework for the design of a military operational supply network

CISDA'09 Proceedings of the Second IEEE international conference on Computational intelligence for security and defense applications
QoS differentiated and fair packet scheduling in broadband wireless access networks

EURASIP Journal on Wireless Communications and Networking - Special issue on broadband wireless access
Single-pass and approximate dynamic-programming algorithms for order acceptance and capacity planning

Journal of Heuristics
Online adaptive policies for ensemble classifiers

Neurocomputing
Improving iterative repair strategies for scheduling with the SVM

Neurocomputing
Reinforcement learning combined with a fuzzy adaptive learning control network (FALCON-R) for pattern classification

Pattern Recognition
Review article: Synergizing reinforcement learning and game theory: A new direction for control

Applied Soft Computing
Fuzzy decision tree function approximation in reinforcement learning

International Journal of Artificial Intelligence and Soft Computing
RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments

The Journal of Machine Learning Research
On the connections between PCTL and dynamic programming

Proceedings of the 13th ACM international conference on Hybrid systems: computation and control
A Convergent Online Single Time Scale Actor Critic Algorithm

The Journal of Machine Learning Research
Reinforcement learning as a means of dynamic aggregate QoS provisioning

Art-QoS'03 Proceedings of the 2003 international conference on Architectures for quality of service in the internet
Computational approaches to reachability analysis of stochastic hybrid systems

HSCC'07 Proceedings of the 10th international conference on Hybrid systems: computation and control
Parallelizing parallel rollout algorithm for solving Markov decision processes

WOMPAT'03 Proceedings of the 2003 international conference on OpenMP shared memory parallel programming
A POMDP approximation algorithm that anticipates the need to observe

PRICAI'00 Proceedings of the 6th Pacific Rim international conference on Artificial intelligence
Near sets: toward approximation space-based object recognition

RSKT'07 Proceedings of the 2nd international conference on Rough sets and knowledge technology
Learning autonomous behaviours for non-holonomic vehicles

IWANN'07 Proceedings of the 9th international work conference on Artificial neural networks
Q-learning with linear function approximation

COLT'07 Proceedings of the 20th annual conference on Learning theory
Learning models of relational MDPs using graph kernels

MICAI'07 Proceedings of the 6th Mexican international conference on Advances in artificial intelligence
Solutions to real-world instances of PSPACE-complete stacking

ESA'07 Proceedings of the 15th annual European conference on Algorithms
Validation of a reinforcement learning policy for dosage optimization of erythropoietin

AI'07 Proceedings of the 20th Australian joint conference on Advances in artificial intelligence
Rollout strategy-based probabilistic causal model approach for the multiple fault diagnosis

Robotics and Computer-Integrated Manufacturing
Using control theory for analysis of reinforcement learning and optimal policy properties in grid-world problems

ICIC'09 Proceedings of the 5th international conference on Emerging intelligent computing technology and applications
Approximate Dynamic Programming for Ambulance Redeployment

INFORMS Journal on Computing
Feature discovery in reinforcement learning using genetic programming

EuroGP'08 Proceedings of the 11th European conference on Genetic programming
A relational hierarchical model for decision-theoretic assistance

ILP'07 Proceedings of the 17th international conference on Inductive logic programming
Relational reinforcement learning for agents in worlds with objects

Adaptive agents and multi-agent systems
Reward-modulated Hebbian learning of decision making

Neural Computation
Improving optimistic exploration in model-free reinforcement learning

ICANNGA'09 Proceedings of the 9th international conference on Adaptive and natural computing algorithms
Bounds for multistage stochastic programs using supervised learning strategies

SAGA'09 Proceedings of the 5th international conference on Stochastic algorithms: foundations and applications
An Approximate Dynamic Programming Approach to Benchmark Practice-Based Heuristics for Natural Gas Storage Valuation

Operations Research
Interaction of culture-based learning and cooperative co-evolution and its application to automatic behavior-based system design

IEEE Transactions on Evolutionary Computation
Joint connection admission control and routing in IEEE 802.16-based mesh networks

IEEE Transactions on Wireless Communications
A general framework to detect unsafe system states from multisensor data stream

IEEE Transactions on Intelligent Transportation Systems
Error Bounds for Approximations from Projected Linear Equations

Mathematics of Operations Research
Linearly Parameterized Bandits

Mathematics of Operations Research
PAC-MDP learning with knowledge-based admissible models

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Cultivating desired behaviour: policy teaching via environment-dynamics tweaks

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Training a Tetris agent via interactive shaping: a demonstration of the TAMER framework

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Information Relaxations and Duality in Stochastic Dynamic Programs

Operations Research
Coordinated learning in multiagent MDPs with infinite state-space

Autonomous Agents and Multi-Agent Systems
Opportunistic Fair Scheduling in Wireless Networks: An Approximate Dynamic Programming Approach

Mobile Networks and Applications
A stochastic approximation method with max-norm projections and its applications to the Q-learning algorithm

ACM Transactions on Modeling and Computer Simulation (TOMACS)
Management of water resource systems in the presence of uncertainties by nonlinear approximation techniques and deterministic sampling

Computational Optimization and Applications
Control of unknown nonlinear systems with efficient transient performance using concurrent exploitation and exploration

IEEE Transactions on Neural Networks
Functional Optimization Through Semilocal Approximate Minimization

Operations Research
Steady-state genetic algorithms for growing topological mapping and localization

PRICAI'10 Proceedings of the 11th Pacific Rim international conference on Trends in artificial intelligence
Adaptive bases for reinforcement learning

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I
Adaptive critic design with ESN critic for bioprocess optimization

ICANN'10 Proceedings of the 20th international conference on Artificial neural networks: Part II
On the potential of process simulation in software project schedule optimization

COMPSAC-W'05 Proceedings of the 29th annual international conference on Computer software and applications conference
Simultaneous learning of perception and action in mobile robots

Robotics and Autonomous Systems
Minimizing total tardiness in a stochastic single machine scheduling problem using approximate dynamic programming

Journal of Scheduling
Automatic induction of Bellman-error features for probabilistic planning

Journal of Artificial Intelligence Research
Pagerank optimization in polynomial time by stochastic shortest path reformulation

ALT'10 Proceedings of the 21st international conference on Algorithmic learning theory
Combining reinforcement learning with symbolic planning

ALAMAS'05/ALAMAS'06/ALAMAS'07 Proceedings of the 5th, 6th and 7th European conference on Adaptive and learning agents and multi-agent systems: adaptation and multi-agent learning
Adaptive traffic signal control using vehicle-to-infrastructure communication: a technical note

Proceedings of the Second International Workshop on Computational Transportation Science
Ranking policies in discrete Markov decision processes

Annals of Mathematics and Artificial Intelligence
Stochastic approximation algorithms for constrained optimization via simulation

ACM Transactions on Modeling and Computer Simulation (TOMACS)
Stochastic control via direct comparison

Discrete Event Dynamic Systems
An Improved Dynamic Programming Decomposition Approach for Network Revenue Management

Manufacturing & Service Operations Management
Continuous state/action reinforcement learning: A growing self-organizing map approach

Neurocomputing
Learning visual representations for perception-action systems

International Journal of Robotics Research
Kalman temporal differences

Journal of Artificial Intelligence Research
A CMOS current-mode dynamic programming circuit

IEEE Transactions on Circuits and Systems Part I: Regular Papers - Special section on 2009 IEEE system-on-chip conference
Hessian matrix distribution for Bayesian policy gradient reinforcement learning

Information Sciences: an International Journal
Sampled fictitious play for approximate dynamic programming

Computers and Operations Research
Adaptive modulation with smoothed flow utility

EURASIP Journal on Wireless Communications and Networking
Decentralized MDPs with sparse interactions

Artificial Intelligence
Reinforcement learning for model building and variance-penalized control

Winter Simulation Conference
A simulation-based approximate dynamic programming approach for the control of the Intel Mini-Fab benchmark model

Winter Simulation Conference
Ambulance redeployment: an approximate dynamic programming approach

Winter Simulation Conference
Declarative programming for agent applications

Autonomous Agents and Multi-Agent Systems
Optimization of heuristic search using recursive algorithm selection and reinforcement learning

Annals of Mathematics and Artificial Intelligence
A dynamic programming strategy to balance exploration and exploitation in the bandit problem

Annals of Mathematics and Artificial Intelligence
Decentralized activation in sensor networks - global games and adaptive filtering games

Digital Signal Processing
Learning to manage combined energy supply systems

Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
Generalized TD Learning

The Journal of Machine Learning Research
Reinforcement learning and the Bayesian control rule

AGI'11 Proceedings of the 4th international conference on Artificial general intelligence
Efficient planning in R-max

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
Network Cargo Capacity Management

Operations Research
The Effect of Robust Decisions on the Cost of Uncertainty in Military Airlift Operations

ACM Transactions on Modeling and Computer Simulation (TOMACS)
Computing Bid Prices for Revenue Management Under Customer Choice Behavior

Manufacturing & Service Operations Management
A framework and a mean-field algorithm for the local control of spatial processes

International Journal of Approximate Reasoning
Self-teaching adaptive dynamic programming for Gomoku

Neurocomputing
Finite-horizon neuro-optimal tracking control for a class of discrete-time nonlinear systems using adaptive dynamic programming approach

Neurocomputing
Learning finite-state controllers for partially observable environments

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Planning under continuous time and resource uncertainty: a challenge for AI

UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
The thing that we tried didn't work very well: deictic representation in reinforcement learning

UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
Value function approximation in zero-sum Markov games

UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
Policy iteration for factored MDPs

UAI'00 Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence
Robust combination of local controllers

UAI'01 Proceedings of the Seventeenth conference on Uncertainty in artificial intelligence
Structured reachability analysis for Markov decision processes

UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence
Hierarchical solution of Markov decision processes using macro-actions

UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence
Hierarchical Knowledge Gradient for Sequential Sampling

The Journal of Machine Learning Research
Robust Approximate Bilinear Programming for Value Function Approximation

The Journal of Machine Learning Research
Consistency of Sequential Bayesian Sampling Policies

SIAM Journal on Control and Optimization
Theory and Applications of Robust Optimization

SIAM Review
Quantum reinforcement learning

ICNC'05 Proceedings of the First international conference on Advances in Natural Computation - Volume Part II
Stochastic reactive production scheduling by multi-agent based asynchronous approximate dynamic programming

CEEMAS'05 Proceedings of the 4th international Central and Eastern European conference on Multi-Agent Systems and Applications
General discounting versus average reward

ALT'06 Proceedings of the 17th international conference on Algorithmic Learning Theory
Probabilistic generalization of simple grammars and its application to reinforcement learning

ALT'06 Proceedings of the 17th international conference on Algorithmic Learning Theory
Approximate policy iteration for closed-loop learning of visual tasks

ECML'06 Proceedings of the 17th European conference on Machine Learning
Task-Driven discretization of the joint space of visual percepts and continuous actions

ECML'06 Proceedings of the 17th European conference on Machine Learning
Solving uncertain Markov decision problems: an interval-based method

ICNC'06 Proceedings of the Second international conference on Advances in Natural Computation - Volume Part II
Load protection model based on intelligent agent regulation

KES'06 Proceedings of the 10th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part I
Symbolic generalization for on-line planning

UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence
Adaptive utility-based scheduling in resource-constrained systems

AI'05 Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

COLT'06 Proceedings of the 19th annual conference on Learning Theory
A machine learning approach to intraday trading on foreign exchange markets

IDEAL'05 Proceedings of the 6th international conference on Intelligent Data Engineering and Automated Learning
Scheduling of re-entrant lines with neuro-dynamic programming based on a new evaluating criterion

ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part III
Aspects of optimal viewpoint selection and viewpoint fusion

ACCV'06 Proceedings of the 7th Asian conference on Computer Vision - Volume Part II
Grey reinforcement learning for incomplete information processing

TAMC'06 Proceedings of the Third international conference on Theory and Applications of Models of Computation
Optimal tuning of continual online exploration in reinforcement learning

ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
Feature extraction for decision-theoretic planning in partially observable environments

ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
Reinforcement learning with echo state networks

ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
Towards finite-sample convergence of direct reinforcement learning

ECML'05 Proceedings of the 16th European conference on Machine Learning
Natural actor-critic

ECML'05 Proceedings of the 16th European conference on Machine Learning
Feature-Discovering approximate value iteration methods

SARA'05 Proceedings of the 6th international conference on Abstraction, Reformulation and Approximation
CBR for state value function approximation in reinforcement learning

ICCBR'05 Proceedings of the 6th international conference on Case-Based Reasoning Research and Development
Optimal motion planning by reinforcement learning in autonomous mobile vehicles

Robotica
Reduced-State SARSA featuring extended channel reassignment for dynamic channel allocation in mobile cellular networks

ICN'05 Proceedings of the 4th international conference on Networking - Volume Part II
The effect of alteration in service environments with distributed intelligent agents

KES'05 Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part III
Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming

Mathematics of Operations Research
Multi-agent case-based reasoning for cooperative reinforcement learners

ECCBR'06 Proceedings of the 8th European conference on Advances in Case-Based Reasoning
The K best-paths approach to approximate dynamic programming with application to portfolio optimization

AI'06 Proceedings of the 19th international conference on Advances in Artificial Intelligence: Canadian Society for Computational Studies of Intelligence
Adaptive opportunistic routing for wireless ad hoc networks

IEEE/ACM Transactions on Networking (TON)
Admission control policies for a multi-class QoS-aware service oriented architecture

ACM SIGMETRICS Performance Evaluation Review
Basis function discovery using spectral clustering and bisimulation metrics

ALA'11 Proceedings of the 11th international conference on Adaptive and Learning Agents
Topological value iteration algorithms

Journal of Artificial Intelligence Research
Stochastic enforced hill-climbing

Journal of Artificial Intelligence Research
Brief paper: Average cost temporal-difference learning

Automatica (Journal of IFAC)
Monte Carlo TD(λ)-methods for the optimal control of discrete-time Markovian jump linear systems

Automatica (Journal of IFAC)
Book review

Automatica (Journal of IFAC)
Book reviews: Self-learning control of finite Markov chains

Automatica (Journal of IFAC)
A time aggregation approach to Markov decision processes

Automatica (Journal of IFAC)
Book review: Stochastic controls-Hamiltonian systems and HJB equations

Automatica (Journal of IFAC)
Distributionally Robust Markov Decision Processes

Mathematics of Operations Research
DetH: approximate hierarchical solution of large Markov decision processes

IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Automatic construction of efficient multiple battery usage policies

IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Discovering hidden structure in factored MDPs

Artificial Intelligence
Adaptive and Stochastic Algorithms for Electrical Impedance Tomography and DC Resistivity Problems with Piecewise Constant Solutions and Many Measurements

SIAM Journal on Scientific Computing
Approximate Dynamic Programming via a Smoothed Linear Program

Operations Research
Feature reinforcement learning in practice

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
ℓ1-Penalized projected Bellman residual

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Recursive least-squares learning with eligibility traces

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
MapReduce for parallel reinforcement learning

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Integrating a partial model into model free reinforcement learning

The Journal of Machine Learning Research
Multi-rate control policies for elastic traffic in CDMA networks

Performance Evaluation
A rapid sparsification method for kernel machines in approximate policy iteration

ISNN'12 Proceedings of the 9th international conference on Advances in Neural Networks - Volume Part I
Computational properties of cyclic and almost-cyclic learning with momentum for feedforward neural networks

ISNN'12 Proceedings of the 9th international conference on Advances in Neural Networks - Volume Part I
Proximity-based non-uniform abstractions for approximate planning

Journal of Artificial Intelligence Research
Plan-based policies for efficient multiple battery load management

Journal of Artificial Intelligence Research
An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs

Information Sciences: an International Journal
SMART: A Stochastic Multiscale Model for the Analysis of Energy Resources, Technology, and Policy

INFORMS Journal on Computing
Bayesian nonparametric inverse reinforcement learning

ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Learning policies for battery usage optimization in electric vehicles

ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Bayesian Learning of Noisy Markov Decision Processes

ACM Transactions on Modeling and Computer Simulation (TOMACS) - Special Issue on Monte Carlo Methods in Statistics
Upper confidence tree-based consistent reactive planning application to minesweeper

LION'12 Proceedings of the 6th international conference on Learning and Intelligent Optimization
Improving the exploration in upper confidence trees

LION'12 Proceedings of the 6th international conference on Learning and Intelligent Optimization
Adaptive value function approximation for continuous-state stochastic dynamic programming

Computers and Operations Research
Scheduling fighter aircraft maintenance with reinforcement learning

Proceedings of the Winter Simulation Conference
Stochastic policy search for variance-penalized semi-Markov control

Proceedings of the Winter Simulation Conference
Using approximate dynamic programming to optimize admission control in cloud computing environment

Proceedings of the Winter Simulation Conference
A sampled fictitious play based learning algorithm for infinite horizon Markov decision processes

Proceedings of the Winter Simulation Conference
Performance Guarantees for Empirical Markov Decision Processes with Applications to Multiperiod Inventory Models

Operations Research
Identifying effective policies in approximate dynamic programming: beyond regression

Proceedings of the Winter Simulation Conference
American option pricing with randomized quasi-Monte Carlo simulations

Proceedings of the Winter Simulation Conference
Reducing the learning time of Tetris in evolution strategies

EA'11 Proceedings of the 10th international conference on Artificial Evolution
Two-step gradient-based reinforcement learning for underwater robotics behavior learning

Robotics and Autonomous Systems
An Actor-Critic based controller for glucose regulation in type 1 diabetes

Computer Methods and Programs in Biomedicine
Sourcing strategies in supply risk management: An approximate dynamic programming approach

Computers and Operations Research
Dynamic Capacity Allocation to Customers Who Remember Past Service

Management Science
Accelerating a Recurrent Neural Network to Finite-Time Convergence for Solving Time-Varying Sylvester Equation by Using a Sign-Bi-power Activation Function

Neural Processing Letters
An efficient L2-norm regularized least-squares temporal difference learning algorithm

Knowledge-Based Systems
Neural-network-based zero-sum game for discrete-time nonlinear systems via iterative adaptive dynamic programming algorithm

Neurocomputing
Assessing the Value of Dynamic Pricing in Network Revenue Management

INFORMS Journal on Computing
Design with shape grammars and reinforcement learning

Advanced Engineering Informatics
Using informative behavior to increase engagement in the TAMER framework

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems

Mathematics of Operations Research
Performance bounds for λ policy iteration and application to the game of Tetris

The Journal of Machine Learning Research
Finite-sample analysis of least-squares policy iteration

The Journal of Machine Learning Research
Dynamic policy programming

The Journal of Machine Learning Research
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model

Machine Learning
Learning policies for battery usage optimization in electric vehicles

Machine Learning
Survey: Combining accuracy and success-rate to improve the performance of eXtended Classifier System (XCS) for data-mining and control applications

Engineering Applications of Artificial Intelligence
Planning for multiple measurement channels in a continuous-state POMDP

Annals of Mathematics and Artificial Intelligence
Scenario Trees and Policy Selection for Multistage Stochastic Programming Using Machine Learning

INFORMS Journal on Computing
Probabilistic planning for continuous dynamic systems under bounded risk

Journal of Artificial Intelligence Research
A novel reinforcement learning architecture for continuous state and action spaces

Advances in Artificial Intelligence
Neuro-optimal control for a class of unknown nonlinear dynamic systems using SN-DHP technique

Neurocomputing
Integrated task and motion planning in belief space

International Journal of Robotics Research
Full-range adaptive cruise control based on supervised adaptive dynamic programming

Neurocomputing
Low-discrepancy sampling for approximate dynamic programming with local approximators

Computers and Operations Research
Reinforcement learning algorithms with function approximation: Recent advances and applications

Information Sciences: an International Journal
General time consistent discounting

Theoretical Computer Science
Construction of approximation spaces for reinforcement learning

The Journal of Machine Learning Research
Unbiased consensus in wireless networks via collisional random broadcast and its application on distributed optimization

Signal Processing
Multiagent meta-level control for radar coordination

Web Intelligence and Agent Systems
A tour of machine learning: An AI perspective

AI Communications - ECAI 2012 Turing and Anniversary Track

Quantified Score

Hi-index	0.02

Abstract

From the Publisher: This is the first textbook that fully explains the neuro-dynamic programming/reinforcement learning methodology, which is a recent breakthrough in the practical application of neural networks and dynamic programming to complex problems of planning, optimal decision making, and intelligent control.