Markov Decision Processes: Discrete Stochastic Dynamic Programming

Authors:
Martin L. Puterman
Affiliations:
-
Venue:
Markov Decision Processes: Discrete Stochastic Dynamic Programming
Year:
1994

Citing 0
Cited 780

Integrating communicative action, conversations and decision theory to coordinate agents

AGENTS '97 Proceedings of the first international conference on Autonomous agents
Optimal policies for handoff and channel assignment in networks of LEO satellites using CDMA

Wireless Networks - Special issue: hybrid and satellite communication networks
Solving very large weakly coupled Markov decision processes

AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Exploration of Multi-State Environments: Local Measures and Back-Propagation of Uncertainty

Machine Learning
Improving call admission policies in wireless networks

Wireless Networks
SMG: a new simulation/optimization approach for large-scale problems

Proceedings of the 31st conference on Winter simulation: Simulation---a bridge to the future - Volume 1
Distributed reinforcement learning for a traffic engineering application

AGENTS '00 Proceedings of the fourth international conference on Autonomous agents
Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms

Machine Learning
A decision-theoretic approach to resource allocation in wireless multimedia networks

DIALM '00 Proceedings of the 4th international workshop on Discrete algorithms and methods for mobile computing and communications
A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions

Machine Learning
Improved results for route planning in stochastic transportation

SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
A reinforcement learning model of selective visual attention

Proceedings of the fifth international conference on Autonomous agents
A dynamic mechanism for time-constrained trading

Proceedings of the fifth international conference on Autonomous agents
Stochastic control of path optimization for inter-switch handoffs in wireless ATM networks

IEEE/ACM Transactions on Networking (TON)
Optimal structured feedback policies for ABR flow control using two-timescale SPSA

IEEE/ACM Transactions on Networking (TON)
Reinforcement learning for fuzzy agents: application to a pighouse environment control

New learning paradigms in soft computing
A multiagent reinforcement learning algorithm using extended optimal response

Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 1
Multi-agent policies: from centralized ones to decentralized ones

Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 3
Optimal plans for aggregation

Proceedings of the twenty-first annual symposium on Principles of distributed computing
Optimal policy for label switched path setup in MPLS networks

Computer Networks: The International Journal of Computer and Telecommunications Networking
Optimal resource allocation in multi-class networks with user-specified utility functions

Computer Networks: The International Journal of Computer and Telecommunications Networking
Planning and Control in Artificial Intelligence: A Unifying Perspective

Applied Intelligence
The Relations Among Potentials, Perturbation Analysis, and Markov Decision Processes

Discrete Event Dynamic Systems
On an Optimization Problem in Sensor Selection

Discrete Event Dynamic Systems
Kernel-Based Reinforcement Learning

Machine Learning
Near-Optimal Reinforcement Learning in Polynomial Time

Machine Learning
Variable Resolution Discretization in Optimal Control

Machine Learning
Linear waste of best fit bin packing on skewed distributions

Random Structures & Algorithms - Probabilistic methods in combinatorial optimization
Opponent interactions between serotonin and dopamine

Neural Networks - Computational models of neuromodulation
From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning

Discrete Event Dynamic Systems
Towards Flexible Teamwork in Persistent Teams: Extended Report

Autonomous Agents and Multi-Agent Systems
Game Theory and Decision Theory in Multi-Agent Systems

Autonomous Agents and Multi-Agent Systems
Optimizing hypervideo navigation using a Markov decision process approach

Proceedings of the tenth ACM international conference on Multimedia
Characterizing Markov Decision Processes

ECML '02 Proceedings of the 13th European Conference on Machine Learning
Reduction and Refinement Strategies for Probabilistic Analysis

PAPM-PROBMIV '02 Proceedings of the Second Joint International Workshop on Process Algebra and Probabilistic Methods, Performance Modeling and Verification
MoDeST - A Modelling and Description Language for Stochastic Timed Systems

PAPM-PROBMIV '01 Proceedings of the Joint International Workshop on Process Algebra and Probabilistic Methods, Performance Modeling and Verification
Reachability Analysis of Probabilistic Systems by Successive Refinements

PAPM-PROBMIV '01 Proceedings of the Joint International Workshop on Process Algebra and Probabilistic Methods, Performance Modeling and Verification
Learning Options in Reinforcement Learning

Proceedings of the 5th International Symposium on Abstraction, Reformulation and Approximation
TTree: Tree-Based State Generalization with Temporally Abstract Actions

Proceedings of the 5th International Symposium on Abstraction, Reformulation and Approximation
Self-Similar Layered Hidden Markov Models

PKDD '01 Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery
Modelling Intelligent Behaviour: The Markov Decision Process Approach

IBERAMIA '98 Proceedings of the 6th Ibero-American Conference on AI: Progress in Artificial Intelligence
Variance-Penalized Reinforcement Learning for Risk-Averse Asset Allocation

IDEAL '00 Proceedings of the Second International Conference on Intelligent Data Engineering and Automated Learning, Data Mining, Financial Engineering, and Intelligent Agents
Game Theory and Artificial Intelligence

Selected papers from the UKMAS Workshop on Foundations and Applications of Multi-Agent Systems
Abstraction of Expectation Functions Using Gaussian Distributions

VMCAI 2003 Proceedings of the 4th International Conference on Verification, Model Checking, and Abstract Interpretation
Karlsruhe Brainstormers - A Reinforcement Learning Approach to Robotic Soccer

RoboCup 2001: Robot Soccer World Cup V
Performance Evaluation := (Process Algebra + Model Checking) × Markov Chains

CONCUR '01 Proceedings of the 12th International Conference on Concurrency Theory
Simulation for Continuous-Time Markov Chains

CONCUR '02 Proceedings of the 13th International Conference on Concurrency Theory
A Probabilistic Extension of UML Statecharts

FTRTFT '02 Proceedings of the 7th International Symposium on Formal Techniques in Real-Time and Fault-Tolerant Systems: Co-sponsored by IFIP WG 2.2
LC-Learning: Phased Method for Average Reward Reinforcement Learning - Preliminary Results

PRICAI '02 Proceedings of the 7th Pacific Rim International Conference on Artificial Intelligence: Trends in Artificial Intelligence
Hidden-Mode Markov Decision Processes for Nonstationary Sequential Decision Making

Sequence Learning - Paradigms, Algorithms, and Applications
Decision-Theoretic Control of Planetary Rovers

Revised Papers from the International Seminar on Advances in Plan-Based Control of Robotic Agents
Towards Stochastic Constraint Programming: A Study of Online Multi-choice Knapsack with Deadlines

CP '01 Proceedings of the 7th International Conference on Principles and Practice of Constraint Programming
Face Recognition Using Foveal Vision

BMVC '00 Proceedings of the First IEEE International Workshop on Biologically Motivated Computer Vision
Learning Rates for Q-Learning

COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Agent System for Load Monitoring of the Heterogeneous Computer Network

PPAM '01 Proceedings of the th International Conference on Parallel Processing and Applied Mathematics-Revised Papers
LC-Learning: Phased Method for Average Reward Reinforcement Learning - Analysis of Optimal Criteria

PRICAI '02 Proceedings of the 7th Pacific Rim International Conference on Artificial Intelligence: Trends in Artificial Intelligence
Nearly deterministic abstractions of Markov decision processes

Eighteenth national conference on Artificial intelligence
Greedy linear value-approximation for factored Markov decision processes

Eighteenth national conference on Artificial intelligence
Piecewise linear value function approximation for factored MDPs

Eighteenth national conference on Artificial intelligence
Two Formal Analyses of Attack Graphs

CSFW '02 Proceedings of the 15th IEEE workshop on Computer Security Foundations
Solving factored MDPs using non-homogeneous partitions

Artificial Intelligence - special issue on planning with uncertainty and incomplete information
Learning evaluation functions to improve optimization by local search

The Journal of Machine Learning Research
ε-mdps: learning in varying environments

The Journal of Machine Learning Research
Index Heuristics for Multiclass M/G/1 Systems with Nonpreemptive Service and Convex Holding Costs

Queueing Systems: Theory and Applications
State of the art on automatic road extraction for GIS update: a novel classification

Pattern Recognition Letters
Nash q-learning for general-sum stochastic games

The Journal of Machine Learning Research
Quality Control for Scalable Media Processing Applications

Journal of Scheduling
A call admission control scheme using genetic algorithms

Proceedings of the 2004 ACM symposium on Applied computing
Should Start-up Companies Be Cautious? Inventory Policies Which Maximise Survival Probabilities

Management Science
Gatekeepers and Referrals in Services

Management Science
Bridging the gap between planning and scheduling

The Knowledge Engineering Review
Probabilistic weak simulation is decidable in polynomial time

Information Processing Letters
CONVERGENCE OF SIMULATION-BASED POLICY ITERATION

Probability in the Engineering and Informational Sciences
BOUNDS AND PERFORMANCE LIMITS OF CHANNEL ASSIGNMENT POLICIES IN CELLULAR NETWORKS

Probability in the Engineering and Informational Sciences
Quantitative stochastic parity games

SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
COMPUTING AVERAGE OPTIMAL CONSTRAINED POLICIES IN STOCHASTIC DYNAMIC PROGRAMMING

Probability in the Engineering and Informational Sciences
BIAS OPTIMALITY IN A QUEUE WITH ADMISSION CONTROL

Probability in the Engineering and Informational Sciences
OPTIMALITY OF CONTROL LIMIT MAINTENANCE POLICIES UNDER NONSTATIONARY DETERIORATION

Probability in the Engineering and Informational Sciences
THE VALUE OF INFORMATION SHARING IN A TWO-STAGE SUPPLY CHAIN WITH PRODUCTION CAPACITY CONSTRAINTS: THE INFINITE HORIZON CASE

Probability in the Engineering and Informational Sciences
A weakly monotonic backward induction algorithm on finite bounded subsets of vector lattices

Journal of Computational and Applied Mathematics - Special Issue: Proceedings of the 10th international congress on computational and applied mathematics (ICCAM-2002)
Optimal Control of Queueing Systems with Heterogeneous Servers

Queueing Systems: Theory and Applications
Parallel Rollout for Online Solution of Partially Observable Markov Decision Processes

Discrete Event Dynamic Systems
Online pricing for bandwidth provisioning in multi-class networks

Computer Networks: The International Journal of Computer and Telecommunications Networking
Learning Rates for Q-learning

The Journal of Machine Learning Research
Optimal strategies for testing nondeterministic systems

ISSTA '04 Proceedings of the 2004 ACM SIGSOFT international symposium on Software testing and analysis
Optimal Pricing and Admission Control in a Queueing System with Periodically Varying Parameters

Queueing Systems: Theory and Applications
P3VI: a partitioned, prioritized, parallel value iterator

ICML '04 Proceedings of the twenty-first international conference on Machine learning
A Human–Robot Cooperative Learning System for Easy Installation of Assistant Robots in New Working Environments

Journal of Intelligent and Robotic Systems
Modeling correlations in web traces and implications for designing replacement policies

Computer Networks: The International Journal of Computer and Telecommunications Networking
Hierarchical Reinforcement Learning in Communication-Mediated Multiagent Coordination

AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 3
Fitting and Compilation of Multiagent Models through Piecewise Linear Functions

AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 3
Planning and programming with first-order markov decision processes: insights and challenges

TARK '01 Proceedings of the 8th conference on Theoretical aspects of rationality and knowledge
Policy iteration type algorithms for recurrent state Markov decision processes

Computers and Operations Research
Planning, learning and coordination in multiagent decision processes

TARK '96 Proceedings of the 6th conference on Theoretical aspects of rationality and knowledge
An Algorithmic Approach for Sensitivity Analysis of Perturbed Quasi-Birth-and-Death Processes

Queueing Systems: Theory and Applications
Routing Of Airplanes To Two Runways: Monotonicity Of Optimal Controls

Probability in the Engineering and Informational Sciences
Efficient QoS Provisioning for Adaptive Multimedia in Mobile Communication Networks by Reinforcement Learning

BROADNETS '04 Proceedings of the First International Conference on Broadband Networks
An empirical study of policy convergence in Markov decision process value iteration

Computers and Operations Research
Optimality of Four-Threshold Policies in Inventory Systems with Customer Returns and Borrowing/Storage Options

Probability in the Engineering and Informational Sciences
Metrics for finite Markov decision processes

UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
Discretized approximations for POMDP with average cost

UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

The Journal of Machine Learning Research
Optimal power and retransmission control policies for random access systems

IEEE/ACM Transactions on Networking (TON)
Basic Ideas for Event-Based Optimization of Markov Systems

Discrete Event Dynamic Systems
Hierarchical Adaptive Dynamic Power Management

IEEE Transactions on Computers
ProbMela and verification of Markov decision processes

ACM SIGMETRICS Performance Evaluation Review
QoS modelling and analysis with UML-statecharts: the StoCharts approach

ACM SIGMETRICS Performance Evaluation Review
QoS Control Strategies for High-Quality Video Processing

Real-Time Systems
A Navigation System for Assistant Robots Using Visually Augmented POMDPs

Autonomous Robots
An approach to handle real time and probabilistic behaviors in e-commerce: validating the SET protocol

Proceedings of the 2005 ACM symposium on Applied computing
Constraint Solving in Uncertain and Dynamic Environments: A Survey

Constraints
Online model-based adaptation for optimizing performance and dependability

WOSS '04 Proceedings of the 1st ACM SIGSOFT workshop on Self-managed systems
Behavior transfer for value-function-based reinforcement learning

Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
Deliberation in a metadata-based modeling and simulation environment for inter-organizational networks

Information Systems - Special issue: The 15th international conference on advanced information systems engineering (CAiSE 2003)
Comparative branching-time semantics for Markov chains

Information and Computation
A theoretical analysis of Model-Based Interval Estimation

ICML '05 Proceedings of the 22nd international conference on Machine learning
Adaptive Clustering: Obtaining Better Clusters Using Feedback and Past Experience

ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
A Reinforcement Learning Algorithm in Cooperative Multi-Robot Domains

Journal of Intelligent and Robotic Systems
Heterogeneous temporal probabilistic agents

ACM Transactions on Computational Logic (TOCL)
Evaluating strategic options using decision-theoretic planning

Information Technology and Management
Efficient computation of time-bounded reachability probabilities in uniform continuous-time Markov decision processes

Theoretical Computer Science - Tools and algorithms for the construction and analysis of systems (TACAS 2004)
A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms

Neural Computation
Stochastic Constraint Programming: A Scenario-Based Approach

Constraints
Maintenance Optimization Of Equipment By Linear Programming

Probability in the Engineering and Informational Sciences
The optimal robust control policy for uncertain semi-Markov control processes

International Journal of Systems Science
A short tutorial on reinforcement learning: review and applications

Intelligent information processing II
Maximum margin planning

ICML '06 Proceedings of the 23rd international conference on Machine learning
The parallel Nash Memory for asymmetric games

Proceedings of the 8th annual conference on Genetic and evolutionary computation
Comparing evolutionary and temporal difference methods in a reinforcement learning domain

Proceedings of the 8th annual conference on Genetic and evolutionary computation
Efficient QoS provisioning for adaptive multimedia in mobile communication networks by reinforcement learning

Mobile Networks and Applications - Special issue: Recent advances in wireless networking
Bisimulation and cocongruence for probabilistic systems

Information and Computation - Special issue: Seventh workshop on coalgebraic methods in computer science 2004
On the relationship between MDPs and the BDI architecture

AAMAS '06 Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
Intelligent handoff management with interference control for next generation wireless systems

Proceedings of the 43rd annual Southeast regional conference - Volume 2
Next generation wireless systems using Markov decision process model

Proceedings of the 43rd annual Southeast regional conference - Volume 2
The Role of Problem Classification in Online Meta-cognition

IAT '06 Proceedings of the IEEE/WIC/ACM international conference on Intelligent Agent Technology
Dynamic routing to heterogeneous collections of unreliable servers

Queueing Systems: Theory and Applications
Efficient approximate planning in continuous space Markovian Decision Problems

AI Communications
A decision support system for direct mailing decisions

Decision Support Systems
Scheduling-free resource management

Data & Knowledge Engineering
Reinforcement learning for dynamic multimedia adaptation

Journal of Network and Computer Applications
Economic metaphors for solving intrafirm allocation problems: what does a market buy us?

Decision Support Systems
Dimensions of complexity of intelligent agents

PCAR '06 Proceedings of the 2006 international symposium on Practical cognitive agents and robots
Production and Inventory Control of a Single Product Assemble-to-Order System with Multiple Customer Classes

Management Science
Determining the Acceptance of Cadaveric Livers Using an Implicit Model of the Waiting List

Operations Research
Brokering strategies in computational grids using stochastic prediction models

Parallel Computing
Online Haggling at a Name-Your-Own-Price Retailer: Theory and Application

Management Science
Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues

Mathematics of Operations Research
Timing Successive Product Introductions with Demand Diffusion and Stochastic Technology Improvement

Manufacturing & Service Operations Management
DEA: An Architecture for Goal Planning and Classification

Neural Computation
A Price-Directed Approach to Stochastic Inventory/Routing

Operations Research
Optimal Policies for a Capacitated Two-Echelon Inventory System

Operations Research
Robust Control of Markov Decision Processes with Uncertain Transition Matrices

Operations Research
Managing Response Time in a Call-Routing Problem with Service Failure

Operations Research
Anticipatory Route Selection

Transportation Science
Collaborative Multiagent Reinforcement Learning by Payoff Propagation

The Journal of Machine Learning Research
Point-Based Value Iteration for Continuous POMDPs

The Journal of Machine Learning Research
Learning to communicate in a decentralized environment

Autonomous Agents and Multi-Agent Systems
Dynamic power management under uncertain information

Proceedings of the conference on Design, automation and test in Europe
Percentile optimization in uncertain Markov decision processes with application to efficient exploration

Proceedings of the 24th international conference on Machine learning
Multi-armed bandit problems with dependent arms

Proceedings of the 24th international conference on Machine learning
The use of an intelligent prompting system for people with dementia

interactions - Designing for seniors: innovations for graying times
Fuzzy optimality relation for perceptive MDPs---the average case

Fuzzy Sets and Systems
Efficient contention resolution protocols for selfish agents

SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
A framework for meta-level control in multi-agent systems

Autonomous Agents and Multi-Agent Systems
Shaping multi-agent systems with gradient reinforcement learning

Autonomous Agents and Multi-Agent Systems
Quantitative verification: models techniques and tools

Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Series Expansions For Finite-State Markov Chains

Probability in the Engineering and Informational Sciences
Quantitative verification: models, techniques and tools

The 6th Joint Meeting on European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering: companion papers
Verifying nondeterministic probabilistic channel systems against ω-regular linear-time properties

ACM Transactions on Computational Logic (TOCL)
Reinforcement learning in multi-agent environment and ant colony for packet scheduling in routers

Proceedings of the 5th ACM international workshop on Mobility management and wireless access
Analysis and optimization of service availability in a HA cluster with load-dependent machine availability

IEEE Transactions on Parallel and Distributed Systems
Formal analysis techniques for gossiping protocols

ACM SIGOPS Operating Systems Review - Gossip-based computer networking
Transfer via inter-task mappings in policy search reinforcement learning

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Commitment-driven distributed joint policy search

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Model-based function approximation in reinforcement learning

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Dynamics based control with an application to area-sweeping problems

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Q-value functions for decentralized POMDPs

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning

Artificial Intelligence
Sequential Monte Carlo in reachability heuristics for probabilistic planning

Artificial Intelligence
Characterization and computation of restless bandit marginal productivity indices

Proceedings of the 2nd international conference on Performance evaluation methodologies and tools
Continuous State Dynamic Programming via Nonexpansive Approximation

Computational Economics
Dynamic multiagent probabilistic inference

International Journal of Approximate Reasoning
Brief paper: Policy iteration based feedback control

Automatica (Journal of IFAC)
Reachability analysis of uncertain systems using bounded-parameter Markov decision processes

Artificial Intelligence
Joint call admission control algorithms: Requirements, approaches, and design considerations

Computer Communications
A stochastic local hot spot alerting technique

Proceedings of the 2008 Asia and South Pacific Design Automation Conference
Assisting elders via dynamic multi-tasks planning: a Markov decision processes based approach

Proceedings of the 1st international conference on Ambient media and systems
Error bounds of optimization algorithms for semi-Markov decision processes

International Journal of Systems Science
An experimental study of adaptive testing for software reliability assessment

Journal of Systems and Software
The need for an interaction cost model in adaptive interfaces

AVI '08 Proceedings of the working conference on Advanced visual interfaces
An object-oriented representation for efficient reinforcement learning

Proceedings of the 25th international conference on Machine learning
Active reinforcement learning

Proceedings of the 25th international conference on Machine learning
Space-indexed dynamic programming: learning to follow trajectories

Proceedings of the 25th international conference on Machine learning
A worst-case comparison between temporal difference and residual gradient with linear function approximation

Proceedings of the 25th international conference on Machine learning
Apprenticeship learning using linear programming

Proceedings of the 25th international conference on Machine learning
Finite-Time Bounds for Fitted Value Iteration

The Journal of Machine Learning Research
Efficient structured policies for admission control in heterogeneous wireless networks

Mobile Networks and Applications
Stochastic modeling of a thermally-managed multi-core system

Proceedings of the 45th annual Design Automation Conference
Learning Agents in an Artificial Power Exchange: Tacit Collusion, Market Power and Efficiency of Two Double-auction Mechanisms

Computational Economics
Tuning continual exploration in reinforcement learning: An optimality property of the Boltzmann strategy

Neurocomputing
The utility of temporal abstraction in reinforcement learning

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
An exact algorithm for solving MDPs under risk-sensitive planning objectives with one-switch utility functions

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Social reward shaping in the prisoner's dilemma

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 3
Resilient dynamic power management under uncertainty

Proceedings of the conference on Design, automation and test in Europe
Navigate like a cabbie: probabilistic reasoning from observed context-aware behavior

UbiComp '08 Proceedings of the 10th international conference on Ubiquitous computing
Reinforcement Learning in Nonstationary Environment Navigation Tasks

CAI '07 Proceedings of the 20th conference of the Canadian Society for Computational Studies of Intelligence on Advances in Artificial Intelligence
An Analysis of Case-Based Value Function Approximation by Approximating State Transition Graphs

ICCBR '07 Proceedings of the 7th international conference on Case-Based Reasoning: Case-Based Reasoning Research and Development
Perception and Developmental Learning of Affordances in Autonomous Robots

KI '07 Proceedings of the 30th annual German conference on Advances in Artificial Intelligence
Options in Readylog Reloaded --- Generating Decision-Theoretic Plan Libraries in Golog

KI '07 Proceedings of the 30th annual German conference on Advances in Artificial Intelligence
Planning and Learning in Environments with Delayed Feedback

ECML '07 Proceedings of the 18th European conference on Machine Learning
Model-Based Reinforcement Learning in a Complex Domain

RoboCup 2007: Robot Soccer World Cup XI
Scheduling for Reliable Execution in Autonomic Systems

ATC '08 Proceedings of the 5th international conference on Autonomic and Trusted Computing
KAF: Kalman Filter Based Adaptive Maintenance for Dependability of Composite Services

CAiSE '08 Proceedings of the 20th international conference on Advanced Information Systems Engineering
Probabilistic CEGAR

CAV '08 Proceedings of the 20th international conference on Computer Aided Verification
Generating Compact MTBDD-Representations from Probmela Specifications

SPIN '08 Proceedings of the 15th international workshop on Model Checking Software
Fitted Natural Actor-Critic: A New Algorithm for Continuous State-Action MDPs

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Transferring Instances for Model-Based Reinforcement Learning

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
A Simple Model for Sequences of Relational State Descriptions

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Average-Price and Reachability-Price Games on Hybrid Automata with Strong Resets

FORMATS '08 Proceedings of the 6th international conference on Formal Modeling and Analysis of Timed Systems
Repairing Decision-Theoretic Policies Using Goal-Oriented Planning

KI '08 Proceedings of the 31st annual German conference on Advances in Artificial Intelligence
Robustness Analysis of SARSA(λ): Different Models of Reward and Initialisation

AIMSA '08 Proceedings of the 13th international conference on Artificial Intelligence: Methodology, Systems, and Applications
CCMAC: coordinated cooperative MAC for wireless LANs

Proceedings of the 11th international symposium on Modeling, analysis and simulation of wireless and mobile systems
Reasoning about actions with sensing under qualitative and probabilistic uncertainty

ACM Transactions on Computational Logic (TOCL)
An analysis of model-based Interval Estimation for Markov Decision Processes

Journal of Computer and System Sciences
An ontology-based approach for providing multimedia personalised recommendations

International Journal of Web and Grid Services
Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes

Simulation
Optimism in the Face of Uncertainty Should be Refutable

Minds and Machines
Strong Probabilistic Planning

MICAI '08 Proceedings of the 7th Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence
Reinforcement Learning for Decision Making in Sequential Visual Attention

Attention in Cognitive Systems. Theories and Systems from an Interdisciplinary Viewpoint
Efficient Reinforcement Learning in Parameterized Models: Discrete Parameter Case

Recent Advances in Reinforcement Learning
A Near Optimal Policy for Channel Allocation in Cognitive Radio

Recent Advances in Reinforcement Learning
Basis Expansion in Natural Actor Critic Methods

Recent Advances in Reinforcement Learning
Reinforcement Learning with the Use of Costly Features

Recent Advances in Reinforcement Learning
Optimistic Planning of Deterministic Systems

Recent Advances in Reinforcement Learning
Policy Iteration for Learning an Exercise Policy for American Options

Recent Advances in Reinforcement Learning
Learning and planning in environments with delayed feedback

Autonomous Agents and Multi-Agent Systems
Commitment-based service coordination

International Journal of Agent-Oriented Software Engineering
Pattern Learning and Decision Making in a Photovoltaic System

SEAL '08 Proceedings of the 7th International Conference on Simulated Evolution and Learning
Probabilistic planning with clear preferences on missing information

Artificial Intelligence
Practical solution techniques for first-order MDPs

Artificial Intelligence
Max-min optimality of service rates in queueing systems with customer-average performance criterion

Proceedings of the 40th Conference on Winter Simulation
Projected equation methods for approximate solution of large linear systems

Journal of Computational and Applied Mathematics
Coordinating randomized policies for increasing security of agent systems

Information Technology and Management
Dynamic routing policies for multiskill call centers

Probability in the Engineering and Informational Sciences
Optimal pricing and production policies of a make-to-stock system with fluctuating demand

Probability in the Engineering and Informational Sciences
Adaptive resource allocation for efficient patient scheduling

Artificial Intelligence in Medicine
Delayed Nondeterminism in Continuous-Time Markov Decision Processes

FOSSACS '09 Proceedings of the 12th International Conference on Foundations of Software Science and Computational Structures: Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009
Recent Developments in Algorithmic Teaching

LATA '09 Proceedings of the 3rd International Conference on Language and Automata Theory and Applications
A Stochastic Inventory Model with Trade Credit

Manufacturing & Service Operations Management
Using Imperfect Advance Demand Information in Production-Inventory Systems with Multiple Customer Classes

Manufacturing & Service Operations Management
A Generalized Gittins Index for a Class of Multiarmed Bandits with General Resource Requirements

Mathematics of Operations Research
Reoptimization Approaches for the Vehicle-Routing Problem with Stochastic Demands

Operations Research
Indexability and Index Heuristics for a Simple Class of Inventory Routing Problems

Operations Research
Effects of system parameters on the optimal policy structure in a class of queueing control problems

Queueing Systems: Theory and Applications
Non deterministic repairable fault trees for computing optimal repair strategy

Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools
Eventually-stationary policies for Markov decision models with non-constant discounting

Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools
2009 Special Issue: Adaptive learning via selectionism and Bayesianism, Part II: The sequential case

Neural Networks
2009 Special Issue: Coordinated machine learning and decision support for situation awareness

Neural Networks
A theoretic and practical framework for scheduling in a stochastic environment

Journal of Scheduling
The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Near-Bayesian exploration in polynomial time

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Performance Evaluation of Direct Heuristic Dynamic Programming using Control-Theoretic Measures

Journal of Intelligent and Robotic Systems
Continuous-time Markov decision processes with nth-bias optimality criteria

Automatica (Journal of IFAC)
Policy iteration for customer-average performance optimization of closed queueing systems

Automatica (Journal of IFAC)
Improving adjustable autonomy strategies for time-critical domains

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Online exploration in least-squares policy iteration

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Scheduling policy design for autonomic systems

International Journal of Autonomous and Adaptive Communications Systems
An Inductive Logic Programming Approach to Statistical Relational Learning

Proceedings of the 2005 conference on An Inductive Logic Programming Approach to Statistical Relational Learning
Policy teaching through reward function learning

Proceedings of the 10th ACM conference on Electronic commerce
Decision with uncertainties, feasibilities, and utilities: towards a unified algebraic framework

Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy
Approximate linear-programming algorithms for graph-based Markov decision processes

Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy
Mean Field Approximation of the Policy Iteration Algorithm for Graph-based Markov Decision Processes

Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy
Reinforcement Learning with the Use of Costly Features

Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence
Learning to search: Functional gradient techniques for imitation learning

Autonomous Robots
Reinforcement learning for robot soccer

Autonomous Robots
Learning teaching strategies in an Adaptive and Intelligent Educational System through Reinforcement Learning

Applied Intelligence
Reachability in Stochastic Timed Games

ICALP '09 Proceedings of the 36th International Colloquium on Automata, Languages and Programming: Part II
A model-checking-based approach to risk analysis in supply chain consolidations

Integrated Computer-Aided Engineering - Selected papers from the IEEE Conference on Information Reuse and Integration (IRI), July 13-15, 2008
Optimal admission control policies for heterogeneous wireless networks

The Fourth International Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness & Workshops
Approximating Matches Made in Heaven

ICALP '09 Proceedings of the 36th International Colloquium on Automata, Languages and Programming: Part I
Optimal Power Management for Server Farm to Support Green Computing

CCGRID '09 Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid
Index Policies for the Admission Control and Routing of Impatient Customers to Heterogeneous Service Stations

Operations Research
Randomized shortest-path problems: Two related models

Neural Computation
An Approximate Dynamic Programming Approach to Network Revenue Management with Customer Choice

Transportation Science
Solving generalized semi-Markov decision processes using continuous phase-type distributions

AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
Metrics for finite Markov decision processes

AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
Learning basis functions in hybrid domains

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 2
Functional value iteration for decision-theoretic planning with general utility functions

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 2
Hard constrained semi-Markov decision processes

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
An Inductive Technique for Parameterised Model Checking of Degenerative Distributed Randomised Protocols

Electronic Notes in Theoretical Computer Science (ENTCS)
Concavely-Priced Probabilistic Timed Automata

CONCUR 2009 Proceedings of the 20th International Conference on Concurrency Theory
Stochastic Games for Verification of Probabilistic Timed Automata

FORMATS '09 Proceedings of the 7th International Conference on Formal Modeling and Analysis of Timed Systems
Hybrid least-squares algorithms for approximate policy evaluation

Machine Learning
Markov decision process applied to the control of hospital elective admissions

Artificial Intelligence in Medicine
Compositional Models for Reinforcement Learning

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Value functions for RL-based behavior transfer: a comparative study

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Risk-sensitive planning with one-switch utility functions: value iteration

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Error bounds for approximate value iteration

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Planning and execution with phase transitions

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Lazy approximation for solving continuous finite-horizon MDPs

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 3
Learning to prevent failure states for a dynamically balancing robot

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 3
Anytime coordination using separable bilinear programs

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Thresholded rewards: acting optimally in timed, zero-sum games

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Physical search problems applying economic search models

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 1
Value-based policy teaching with active indirect elicitation

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 1
Piecewise linear dynamic programming for constrained POMDPs

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 1
Potential-based shaping in model-based reinforcement learning

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Value-function approximations for partially observable Markov decision processes

Journal of Artificial Intelligence Research
Prioritized goal decomposition of Markov decision processes: toward a synthesis of classical and decision theoretic planning

IJCAI'97 Proceedings of the Fifteenth international joint conference on Artificial intelligence - Volume 2
Efficient solution algorithms for factored MDPs

Journal of Artificial Intelligence Research
Accelerating reinforcement learning through implicit imitation

Journal of Artificial Intelligence Research
On polynomial sized MDP succinct policies

Journal of Artificial Intelligence Research
Restricted value iteration: theory and algorithms

Journal of Artificial Intelligence Research
Perseus: randomized point-based value iteration for POMDPs

Journal of Artificial Intelligence Research
Integrating learning from examples into the search for diagnostic policies

Journal of Artificial Intelligence Research
The first probabilistic track of the international planning competition

Journal of Artificial Intelligence Research
Solving factored MDPs with hybrid state and action variables

Journal of Artificial Intelligence Research
FLUCAP: a heuristic search planner for first-order MDPs

Journal of Artificial Intelligence Research
An algebraic graphical model for decision with uncertainties, feasibilities, and utilities

Journal of Artificial Intelligence Research
First order decision diagrams for relational MDPs

Journal of Artificial Intelligence Research
Communication-based decomposition mechanisms for decentralized MDPs

Journal of Artificial Intelligence Research
Optimal and approximate Q-value functions for decentralized POMDPs

Journal of Artificial Intelligence Research
Online planning algorithms for POMDPs

Journal of Artificial Intelligence Research
A heuristic search approach to planning with continuous resources in stochastic domains

Journal of Artificial Intelligence Research
Reinforcement learning: a survey

Journal of Artificial Intelligence Research
The computational complexity of probabilistic planning

Journal of Artificial Intelligence Research
Sequential optimality and coordination in multiagent systems

IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 1
Computing near optimal strategies for stochastic investment planning problems

IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
Variable resolution discretization for high-accuracy solutions of optimal control problems

IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
Continuous time associative bandit problems

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
First order decision diagrams for relational MDPs

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Topological value iteration algorithm for Markov decision processes

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Average-reward decentralized Markov decision processes

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Team programming in Golog under partial observability

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
State space search for risk-averse agents

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Hierarchical heuristic forward search in stochastic domains

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
An analysis of Laplacian methods for value function approximation in MDPs

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
An MDP-based application oriented optimal policy for wireless sensor networks

CODES+ISSS '09 Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis
Scenario-based stochastic constraint programming

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Faster heuristic search algorithms for planning with uncertainty and full feedback

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Extending DTGOLOG with options

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Natural actor-critic algorithms

Automatica (Journal of IFAC)
Reinforcement Learning in RoboCup KeepAway with Partial Observability

WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
A bilinear programming approach for multiagent planning

Journal of Artificial Intelligence Research
Complexity of probabilistic planning under average rewards

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
Max-norm projections for factored MDPs

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
Symbolic dynamic programming for first-order MDPs

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
From Q(λ) to average Q-learning: efficient implementation of an asymptotic approximation

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Learning subjective representations for planning

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Conditional planning in the discrete belief space

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Probabilistic reasoning for plan robustness

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
An MCMC approach to solving hybrid factored MDPs

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Algebraic Markov decision processes

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Process-oriented planning and average-reward optimality

IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Exploiting structure in policy construction

IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Planning and acting in partially observable stochastic domains

Artificial Intelligence
Utility-based on-line exploration for repeated navigation in an embedded graph

Artificial Intelligence
Intensional dynamic programming. A Rosetta stone for structured dynamic programming

Journal of Algorithms
Remote patient monitoring service using heterogeneous wireless access networks: architecture and optimization

IEEE Journal on Selected Areas in Communications - Special issue on wireless and pervasive communications for healthcare
A constrained MDP approach to dynamic quantizer design for HMM state estimation

IEEE Transactions on Signal Processing
An elective surgery scheduling problem considering patient priority

Computers and Operations Research
A Strongly Polynomial Algorithm for Controlled Queues

Mathematics of Operations Research
Optimal control of a single queue with retransmissions: delay-dropping tradeoffs

IEEE Transactions on Wireless Communications
Closed-loop cross-layer SDMA designs with outdated CSIT

IEEE Transactions on Wireless Communications
Computing equilibria in multiplayer stochastic games of imperfect information

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Bayesian real-time dynamic programming

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Generalized first order decision diagrams for first order Markov decision processes

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
An RL-based scheduling algorithm for video traffic in high-rate wireless personal area networks

Computer Networks: The International Journal of Computer and Telecommunications Networking
Optimal stochastic policies for distributed data aggregation in wireless sensor networks

IEEE/ACM Transactions on Networking (TON)
Monotonic robust optimal control policies for the time-quality trade-offs in concurrent new product development (NPD)

Computers & Mathematics with Applications
Cross-layer design of FDD-OFDM systems based on ACK/NAK feedbacks

IEEE Transactions on Information Theory
Optimality of myopic sensing in multichannel opportunistic access

IEEE Transactions on Information Theory
Erlang loss bounds for OT–ICU systems

Queueing Systems: Theory and Applications
A rules-based approach for configuring chains of classifiers in real-time stream mining systems

EURASIP Journal on Advances in Signal Processing
Misplaced item search in a warehouse using an RFID-based partially observable Markov decision process (POMDP) model

CASE'09 Proceedings of the fifth annual IEEE international conference on Automation science and engineering
Optimal admission and eviction control of secondary users at cognitive radio HotSpots

SECON'09 Proceedings of the 6th Annual IEEE communications society conference on Sensor, Mesh and Ad Hoc Communications and Networks
Efficient Uncertainty Propagation for Reinforcement Learning with Limited Data

ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part I
Finding Best k Policies

ADT '09 Proceedings of the 1st International Conference on Algorithmic Decision Theory
Optimizing the Hurwicz Criterion in Decision Trees with Imprecise Probabilities

ADT '09 Proceedings of the 1st International Conference on Algorithmic Decision Theory
Quantitative Analysis under Fairness Constraints

ATVA '09 Proceedings of the 7th International Symposium on Automated Technology for Verification and Analysis
Generating Explanations Based on Markov Decision Processes

MICAI '09 Proceedings of the 8th Mexican International Conference on Artificial Intelligence
Cosine Policy Iteration for Solving Infinite-Horizon Markov Decision Processes

MICAI '09 Proceedings of the 8th Mexican International Conference on Artificial Intelligence
Navigation Method Selector for an Autonomous Explorer Rover with a Markov Decision Process

ICIRA '09 Proceedings of the 2nd International Conference on Intelligent Robotics and Applications
A mean field approach for optimization in particle systems and applications

Proceedings of the Fourth International ICST Conference on Performance Evaluation Methodologies and Tools
Myopic versus clairvoyant admission policies in wireless networks

Proceedings of the Fourth International ICST Conference on Performance Evaluation Methodologies and Tools
Multiple abstraction levels in performance analysis of WSN monitoring systems

Proceedings of the Fourth International ICST Conference on Performance Evaluation Methodologies and Tools
An adaptive opportunistic routing scheme for wireless ad-hoc networks

ISIT'09 Proceedings of the 2009 IEEE international conference on Symposium on Information Theory - Volume 4
Robust adaptive Markov decision processes in multi-vehicle applications

ACC'09 Proceedings of the 2009 conference on American Control Conference
Neural networks and Markov models for the iterated prisoner's dilemma

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
An alpha derivative formulation of the Hamilton-Jacobi-Bellman equation of dynamic programming

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Reconfigurable disruption tolerant routing via reinforcement learning

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
PMaude: Rewrite-based Specification Language for Probabilistic Object Systems

Electronic Notes in Theoretical Computer Science (ENTCS)
Partial Order Reduction for Probabilistic Branching Time

Electronic Notes in Theoretical Computer Science (ENTCS)
Comparative branching-time semantics for Markov chains

Information and Computation
Bisimulation and cocongruence for probabilistic systems

Information and Computation - Special issue: Seventh workshop on coalgebraic methods in computer science 2004
Conformant plans and beyond: Principles and complexity

Artificial Intelligence
Deliberation in a metadata-based modeling and simulation environment for inter-organizational networks

Information Systems - Special issue: The 15th international conference on advanced information systems engineering (CAiSE 2003)
A markovian futures market for computing power

Proceedings of the first joint WOSP/SIPEW international conference on Performance engineering
The Lagrangian relaxation based resources allocation methods for air-to-ground operations under uncertainty circumstances

CCDC'09 Proceedings of the 21st annual international conference on Chinese control and decision conference
SI-CCMAC: sender initiating concurrent cooperative MAC for wireless LANs

WiOPT'09 Proceedings of the 7th international conference on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks
A simple heuristic for load balancing in parallel processing networks with highly variable service time distributions

Queueing Systems: Theory and Applications
Optimal adaptive modulation and coding with switching costs

IEEE Transactions on Communications
Performance study and system optimization on sleep mode operation in IEEE 802.16e

IEEE Transactions on Wireless Communications
Joint admission control and antenna assignment for multiclass QoS in spatial multiplexing MIMO wireless networks

IEEE Transactions on Wireless Communications
Delay-sensitive distributed power and transmission threshold control for S-ALOHA network with finite state Markov fading channels

IEEE Transactions on Wireless Communications
Predictive-flow-queue-based energy optimization for gigabit ethernet controllers

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
An optimal warning-zone-length assignment algorithm for real-time and multiple-QoS on-chip bus arbitration

ACM Transactions on Embedded Computing Systems (TECS)
Acceleration Operators in the Value Iteration Algorithms for Markov Decision Processes

Operations Research
Percentile Optimization for Markov Decision Processes with Parameter Uncertainty

Operations Research
Partially Observable Markov Decision Processes: A Geometric Technique and Analysis

Operations Research
A Markov Chains repurchasing model for CRM using system dynamics

MS '08 Proceedings of the 19th IASTED International Conference on Modelling and Simulation
On-Line Policy Gradient Estimation with Multi-Step Sampling

Discrete Event Dynamic Systems
Application of a seeded hybrid genetic algorithm for user interface design

SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
Rejoinder---The Languages of Stochastic Optimization

INFORMS Journal on Computing
Approximate dynamic programming techniques for the control of time-varying queuing systems applied to call centers with abandonments and retrials

Probability in the Engineering and Informational Sciences
Value iteration and action ε-approximation of optimal policies in discounted Markov decision processes

MATH'09 Proceedings of the 14th WSEAS International Conference on Applied mathematics
Optimal Commodity Trading with a Capacitated Storage Asset

Management Science
Rechargeable sensor activation under temporally correlated events

Wireless Networks
An interactive framework for image annotation through gaming

Proceedings of the international conference on Multimedia information retrieval
Interactive Markov chains: and the quest for quantified quality

Interactive Markov chains: and the quest for quantified quality
CCMAC: Coordinated cooperative MAC for wireless LANs

Computer Networks: The International Journal of Computer and Telecommunications Networking
Review article: Synergizing reinforcement learning and game theory-A new direction for control

Applied Soft Computing
2010 Special Issue: Online learning of shaping rewards in reinforcement learning

Neural Networks
Transfer Learning for Reinforcement Learning Domains: A Survey

The Journal of Machine Learning Research
Reinforcement Learning in Finite MDPs: PAC Analysis

The Journal of Machine Learning Research
A Convergent Online Single Time Scale Actor Critic Algorithm

The Journal of Machine Learning Research
Game-theoretic agent programming in Golog under partial observability

KI'06 Proceedings of the 29th annual German conference on Artificial intelligence
Adaptive multi-agent programming in GTGolog

KI'06 Proceedings of the 29th annual German conference on Artificial intelligence
Rewriting logic and probabilities

RTA'03 Proceedings of the 14th international conference on Rewriting techniques and applications
Abstract interpretation of programs as Markov decision processes

SAS'03 Proceedings of the 10th international conference on Static analysis
Parallelizing parallel rollout algorithm for solving Markov decision processes

WOMPAT'03 Proceedings of the OpenMP applications and tools 2003 international conference on OpenMP shared memory parallel programming
Precise fixpoint computation through strategy iteration

ESOP'07 Proceedings of the 16th European conference on Programming
Grid brokering for batch allocation using indexes

NET-COOP'07 Proceedings of the 1st EuroFGI international conference on Network control and optimization
Pure stationary optimal strategies in Markov decision processes

STACS'07 Proceedings of the 24th annual conference on Theoretical aspects of computer science
Globally optimal user-network association in an 802.11 WLAN & 3G UMTS hybrid cell

ITC20'07 Proceedings of the 20th international teletraffic conference on Managing traffic performance in converged networks
Qualitative probabilistic modelling in event-B

IFM'07 Proceedings of the 6th international conference on Integrated formal methods
Computing and using lower and upper bounds for action elimination in MDP planning

SARA'07 Proceedings of the 7th International conference on Abstraction, reformulation, and approximation
Model-based exploration in continuous state spaces

SARA'07 Proceedings of the 7th International conference on Abstraction, reformulation, and approximation
A cost-based model and algorithms for interleaving solving and elicitation of CSPs

CP'07 Proceedings of the 13th international conference on Principles and practice of constraint programming
Capacity Rationing in Stochastic Rental Systems with Advance Demand Information

Operations Research
Dynamic Lead-Time Quotation for an M/M/1 Base-Stock Inventory Queue

Operations Research
Solving the uncertainty of vertical handovers in multi-radio home networks

Computer Communications
Analyzing the dynamics of stigmergetic interactions through pheromone games

Theoretical Computer Science
Asymptotically optimal parallel resource assignment with interference

Queueing Systems: Theory and Applications
Modeling POMDPs for generating and simulating stock investment policies

Proceedings of the 2010 ACM Symposium on Applied Computing
Skill combination for reinforcement learning

IDEAL'07 Proceedings of the 8th international conference on Intelligent data engineering and automated learning
Computing game values for crash games

ATVA'07 Proceedings of the 5th international conference on Automated technology for verification and analysis
Deciding simulations on probabilistic automata

ATVA'07 Proceedings of the 5th international conference on Automated technology for verification and analysis
Quantitative model checking revisited: neither decidable nor approximable

FORMATS'07 Proceedings of the 5th international conference on Formal modeling and analysis of timed systems
Reinforcement learning of predictive features in affordance perception

Proceedings of the 2006 international conference on Towards affordance-based robot control
Using control theory for analysis of reinforcement learning and optimal policy properties in grid-world problems

ICIC'09 Proceedings of the Intelligent computing 5th international conference on Emerging intelligent computing technology and applications
Commitment-based service coordination

SOCASE'08 Proceedings of the 2008 AAMAS international conference on Service-oriented computing: agents, semantics, and engineering
On decision problems for probabilistic Büchi automata

FOSSACS'08/ETAPS'08 Proceedings of the Theory and practice of software, 11th international conference on Foundations of software science and computational structures
Opportunistic transmission for wireless sensor networks under delay constraints

ICCSA'07 Proceedings of the 2007 international conference on Computational science and its applications - Volume Part III
Multi-channel opportunistic access: a case of restless bandits with multiple plays

Allerton'09 Proceedings of the 47th annual Allerton conference on Communication, control, and computing
REGAL: a regularization based algorithm for reinforcement learning in weakly communicating MDPs

UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
An overview of planning under uncertainty

Artificial intelligence today
Knowledge representation for stochastic decision processes

Artificial intelligence today
TTree: tree-based state generalization with temporally abstract actions

Adaptive agents and multi-agent systems
Model-based testing of object-oriented reactive systems with Spec Explorer

Formal methods and testing
Decision problems for Nash equilibria in stochastic games

CSL'09/EACSL'09 Proceedings of the 23rd CSL international conference and 18th EACSL Annual conference on Computer science logic
Online regret bounds for Markov decision processes with deterministic transitions

Theoretical Computer Science
All about Maude - a high-performance logical framework: how to specify, program and verify systems in rewriting logic

All about Maude - a high-performance logical framework: how to specify, program and verify systems in rewriting logic
Stochastic rate control for scalable VBR video streaming over wireless networks

GLOBECOM'09 Proceedings of the 28th IEEE conference on Global telecommunications
Approximating labelled Markov processes again!

CALCO'09 Proceedings of the 3rd international conference on Algebra and coalgebra in computer science
Bounds for multistage stochastic programs using supervised learning strategies

SAGA'09 Proceedings of the 5th international conference on Stochastic algorithms: foundations and applications
An Approximate Dynamic Programming Approach to Benchmark Practice-Based Heuristics for Natural Gas Storage Valuation

Operations Research
Fair Dynamic Routing in Large-Scale Heterogeneous-Server Systems

Operations Research
Automated large-scale control of gene regulatory networks

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Online learning in autonomic multi-hop wireless networks for transmitting mission-critical applications

IEEE Journal on Selected Areas in Communications
Admission control for a multi-server queue with abandonment

Queueing Systems: Theory and Applications
Asymptotically optimal control of parallel tandem queues with loss

Queueing Systems: Theory and Applications
Thermal management of biosensor networks

CCNC'10 Proceedings of the 7th IEEE conference on Consumer communications and networking conference
Optimizing debt collections using constrained reinforcement learning

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
When Promotions Meet Operations: Cross-Selling and Its Effect on Call Center Performance

Manufacturing & Service Operations Management
Function allocation for NextGen airspace via agents

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Industry track
High-level reinforcement learning in strategy games

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
PAC-MDP learning with knowledge-based admissible models

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Optimal policy switching algorithms for reinforcement learning

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Planning against fictitious players in repeated normal form games

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Cultivating desired behaviour: policy teaching via environment-dynamics tweaks

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Approximate dynamic programming with affine ADDs

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Energy efficient transmission strategies for body sensor networks with energy harvesting

IEEE Transactions on Communications
Intertemporal Pricing and Consumer Stockpiling

Operations Research
A Shadow Simplex Method for Infinite Linear Programs

Operations Research
A stochastic approximation method with max-norm projections and its applications to the Q-learning algorithm

ACM Transactions on Modeling and Computer Simulation (TOMACS)
Interleaving solving and elicitation of constraint satisfaction problems based on expected cost

Constraints
Risk-Based Policies for Airport Security Checkpoint Screening

Transportation Science
Review:

The Knowledge Engineering Review
When are the value iteration maximizers close to an optimal stationary policy of a discounted Markov decision process?: closing the gap between the Borel space theory and actual computations

WSEAS Transactions on Mathematics
Near-optimal Regret Bounds for Reinforcement Learning

The Journal of Machine Learning Research
Evolving Static Representations for Task Transfer

The Journal of Machine Learning Research
A customer satisfaction inventory model for supply chain integration

Expert Systems with Applications: An International Journal
Multiscale Adaptive Agent-Based Management of Storage-Enabled Photovoltaic Facilities

Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
Uncertainty Propagation for Efficient Exploration in Reinforcement Learning

Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
Constraint-Based Controller Synthesis in Non-Deterministic and Partially Observable Domains

Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
On Finding Compromise Solutions in Multiobjective Markov Decision Processes

Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
An investigation into mathematical programming for finite horizon decentralized POMDPs

Journal of Artificial Intelligence Research
Approximate robust policy iteration using multilayer perceptron neural networks for discounted infinite-horizon Markov decision processes with uncertain correlated transition matrices

IEEE Transactions on Neural Networks
Rewarding behaviors

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 2
Structured solution methods for non-Markovian decision processes

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
A robust and fast action selection mechanism for planning

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
Human-aware task planning: An application to mobile robots

ACM Transactions on Intelligent Systems and Technology (TIST)
Optimal scheduling in high-speed downlink packet access networks

ACM Transactions on Modeling and Computer Simulation (TOMACS)
Optimizing the power delivery network in dynamically voltage scaled systems with uncertain power mode transition times

Proceedings of the Conference on Design, Automation and Test in Europe
Fuzzy Markovian decision processes: Application to queueing systems

Computers & Mathematics with Applications
Exponential lower bounds for policy iteration

ICALP'10 Proceedings of the 37th international colloquium conference on Automata, languages and programming: Part II
On the characteristics of sequential decision problems and their impact on evolutionary computation and reinforcement learning

EA'09 Proceedings of the 9th international conference on Artificial evolution
A framework for verification of software with time and probabilities

FORMATS'10 Proceedings of the 8th international conference on Formal modeling and analysis of timed systems
Reducing reinforcement learning to KWIK online regression

Annals of Mathematics and Artificial Intelligence
Transmission control in cognitive radio as a Markovian dynamic game: structural result on randomized threshold policies

IEEE Transactions on Communications
An average-reward reinforcement learning algorithm for computing bias-optimal policies

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
Auto-exploratory average reward reinforcement learning

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
A study of shared-memory mutual exclusion protocols using CADP

FMICS'10 Proceedings of the 15th international conference on Formal methods for industrial critical systems
DSPM: dynamic security policy management for optimizing performance in wireless networks

MILCOM'06 Proceedings of the 2006 IEEE conference on Military communications
Eliciting Patients' Revealed Preferences: An Inverse Markov Decision Process Approach

Decision Analysis
Adaptive traffic signal control using vehicle-to-infrastructure communication: a technical note

Proceedings of the Second International Workshop on Computational Transportation Science
Ranking policies in discrete Markov decision processes

Annals of Mathematics and Artificial Intelligence
Dynamic control of a single-server system with abandonments

Queueing Systems: Theory and Applications
Optimal Breast Biopsy Decision-Making Based on Mammographic Features and Demographic Factors

Operations Research
Cost-based query answering in action probabilistic logic programs

SUM'10 Proceedings of the 4th international conference on Scalable uncertainty management
Compositional abstraction of PEPA models for transient analysis

EPEW'10 Proceedings of the 7th European performance engineering conference on Computer performance engineering
Towards analysis of semi-Markov decision processes

AICI'10 Proceedings of the 2010 international conference on Artificial intelligence and computational intelligence: Part I
On model checking techniques for randomized distributed systems

IFM'10 Proceedings of the 8th international conference on Integrated formal methods
Markov decision processes from colored Petri nets

SBIA'10 Proceedings of the 20th Brazilian conference on Advances in artificial intelligence
Symbolic bounded real-time dynamic programming

SBIA'10 Proceedings of the 20th Brazilian conference on Advances in artificial intelligence
Stochastic control via direct comparison

Discrete Event Dynamic Systems
Teaching randomized learners with feedback

Information and Computation
Internal-time temporal difference model for neural value-based decision making

Neural Computation
Sparse approximate dynamic programming for dialog management

SIGDIAL '10 Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Kalman temporal differences

Journal of Artificial Intelligence Research
A Markovian process modeling for Pickomino

CG'10 Proceedings of the 7th international conference on Computers and games
Comparing a class of dynamic model-based reinforcement learning schemes for handoff prioritization in mobile communication networks

Expert Systems with Applications: An International Journal
Speeding up learning automata based multi agent systems using the concepts of stigmergy and entropy

Expert Systems with Applications: An International Journal
Average Continuous Control of Piecewise Deterministic Markov Processes

SIAM Journal on Control and Optimization
Average Optimality in Nonhomogeneous Infinite Horizon Markov Decision Processes

Mathematics of Operations Research
Optimal channel access for TCP performance improvement in cognitive radio networks

Wireless Networks
Uncertainty-aware dynamic power management in partially observable domains

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
An approach for dynamic optimization of prevention program implementation in stochastic environments

SBP'11 Proceedings of the 4th international conference on Social computing, behavioral-cultural modeling and prediction
Sample-efficient batch reinforcement learning for dialogue management optimization

ACM Transactions on Speech and Language Processing (TSLP)
Adaptive modulation with smoothed flow utility

EURASIP Journal on Wireless Communications and Networking
Time-varying management of data storage

HotDep'05 Proceedings of the First conference on Hot topics in system dependability
Synthesis for PCTL in parametric Markov decision processes

NFM'11 Proceedings of the Third international conference on NASA Formal methods
Improving strategies via SMT solving

ESOP'11/ETAPS'11 Proceedings of the 20th European conference on Programming languages and systems: part of the joint European conferences on theory and practice of software
Efficient solutions to factored MDPs with imprecise transition probabilities

Artificial Intelligence
Decentralized MDPs with sparse interactions

Artificial Intelligence
A simulation-based approximate dynamic programming approach for the control of the Intel Mini-Fab benchmark model

Winter Simulation Conference
An MDP-based admission control for a QoS-aware service-oriented system

Proceedings of the Nineteenth International Workshop on Quality of Service
A constrained MDP-based vertical handoff decision algorithm for 4G heterogeneous wireless networks

Wireless Networks
Semi-Markov Control Models with Partially Known Holding Times Distribution: Discounted and Average Criteria

Acta Applicandae Mathematicae: an international survey journal on applying mathematics and mathematical applications
Job control in heterogeneous computing systems

Journal of Computer and Systems Sciences International
Dynamic pricing and scheduling in a multi-class single-server queueing system

Queueing Systems: Theory and Applications
Dynamic resource allocation in a multi-product make-to-stock production system

Queueing Systems: Theory and Applications
Towards a real-world scenario for investigating organic computing principles in heterogeneous societies of robots

Proceedings of the 2011 workshop on Organic computing
Rapid specification and automated generation of prompting systems to assist people with dementia

Pervasive and Mobile Computing
Optimization of heuristic search using recursive algorithm selection and reinforcement learning

Annals of Mathematics and Artificial Intelligence
A dynamic programming strategy to balance exploration and exploitation in the bandit problem

Annals of Mathematics and Artificial Intelligence
Improving Gaussian process value function approximation in policy gradient algorithms

ICANN'11 Proceedings of the 21st international conference on Artificial neural networks - Volume Part II
Theoretical considerations of potential-based reward shaping for multi-agent systems

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Decision theoretic behavior composition

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Integrating reinforcement learning with human demonstrations of varying ability

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Datum-wise classification: a sequential approach to sparsity

ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I
Sparse Kernel-SARSA(λ) with an eligibility trace

ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III
Toward error-bounded algorithms for infinite-horizon DEC-POMDPs

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
Efficient planning in R-max

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
Optimal resource allocation for multiqueue systems with a shared server pool

Queueing Systems: Theory and Applications
Network Cargo Capacity Management

Operations Research
Relating average and discounted costs for quantitative analysis of timed systems

EMSOFT '11 Proceedings of the ninth ACM international conference on Embedded software
Optimal resource allocation in synchronized multi-tier Internet services

Performance Evaluation
The complexity of Nash equilibria in limit-average games

CONCUR'11 Proceedings of the 22nd international conference on Concurrency theory
Constraint programming for controller synthesis

CP'11 Proceedings of the 17th international conference on Principles and practice of constraint programming
Beyond QCSP for solving control problems

CP'11 Proceedings of the 17th international conference on Principles and practice of constraint programming
Probabilistic abstractions with arbitrary domains

SAS'11 Proceedings of the 18th international conference on Static analysis
A Bayesian nonparametric approach to modeling motion patterns

Autonomous Robots
Optimal issuing of perishables with a short fixed shelf life

ICCL'11 Proceedings of the Second international conference on Computational logistics
Decision-theoretic planning with generalized first-order decision diagrams

Artificial Intelligence
On minimizing ordered weighted regrets in multiobjective Markov decision processes

ADT'11 Proceedings of the Second international conference on Algorithmic decision theory
Optimal admission control for a QoS-aware service-oriented system

ServiceWave'11 Proceedings of the 4th European conference on Towards a service-based internet
HTN-style planning in relational POMDPs using first-order FSCs

KI'11 Proceedings of the 34th Annual German conference on Advances in artificial intelligence
Information technology for healthcare transformation

IBM Journal of Research and Development
Probabilistic relational planning with first order decision diagrams

Journal of Artificial Intelligence Research
The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate

Mathematics of Operations Research
A hierarchical decomposition of decision process Petri nets for modeling complex systems

International Journal of Applied Mathematics and Computer Science
Dynamic traffic splitting to parallel wireless networks with partial information: A Bayesian approach

Performance Evaluation
A framework and a mean-field algorithm for the local control of spatial processes

International Journal of Approximate Reasoning
Algorithm portfolio selection as a bandit problem with unbounded losses

Annals of Mathematics and Artificial Intelligence
Continuous value function approximation for sequential bidding policies

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
SPUDD: stochastic planning using decision diagrams

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Solving POMDPs by searching the space of finite policies

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Learning finite-state controllers for partially observable environments

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
A possibilistic model for qualitative sequential decision problems under uncertainty in partially observable environments

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Planning under continuous time and resource uncertainty: a challenge for AI

UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
Distributed planning in hierarchical factored MDPs

UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
Inductive policy selection for first-order MDPs

UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
Hierarchical solution of Markov decision processes using macro-actions

UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence
Correlated action effects in decision theoretic regression

UAI'97 Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence
Time-critical action: representations and application

UAI'97 Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence
Learning conventions in multiagent stochastic domains using likelihood estimates

UAI'96 Proceedings of the Twelfth international conference on Uncertainty in artificial intelligence
Theory and Applications of Robust Optimization

SIAM Review
Game-theoretic reasoning about actions in nonmonotonic causal theories

LPNMR'05 Proceedings of the 8th international conference on Logic Programming and Nonmonotonic Reasoning
Game-theoretic aspects of designing hyperlink structures

WINE'06 Proceedings of the Second international conference on Internet and Network Economics
A hierarchical framework for composing nested web processes

ICSOC'06 Proceedings of the 4th international conference on Service-Oriented Computing
Optimal Energy Commitments with Storage and Intermittent Supply

Operations Research
TECHNICAL NOTE: An Optimal Policy for Joint Dynamic Price and Lead-Time Quotation

Operations Research
Partial order reduction for Markov decision processes: a survey

FMCO'05 Proceedings of the 4th international conference on Formal Methods for Components and Objects
Decomposition of multi-operator queries on semiring-based graphical models

CP'06 Proceedings of the 12th international conference on Principles and Practice of Constraint Programming
Scaling model-based average-reward reinforcement learning for product delivery

ECML'06 Proceedings of the 17th European conference on Machine Learning
On reduction criteria for probabilistic reward models

FSTTCS'06 Proceedings of the 26th international conference on Foundations of Software Technology and Theoretical Computer Science
Computational methods for reachability analysis of stochastic hybrid systems

HSCC'06 Proceedings of the 9th international conference on Hybrid Systems: computation and control
Performance bounds for mobile cellular networks with handover prediction

MMNS'05 Proceedings of the 8th international conference on Management of Multimedia Networks and Services
Budgeted learning of naïve-Bayes classifiers

UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence
Optimal limited contingency planning

UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence
Mobile agent migration: an optimal policy

AI'05 Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence
Teaching randomized learners

COLT'06 Proceedings of the 19th annual conference on Learning Theory
A characterization of meaningful schedulers for continuous-time Markov decision processes

FORMATS'06 Proceedings of the 4th international conference on Formal Modeling and Analysis of Timed Systems
Symbolic verification of communicating systems with probabilistic message losses: liveness and fairness

FORTE'06 Proceedings of the 26th IFIP WG 6.1 international conference on Formal Techniques for Networked and Distributed Systems
An optimal best-first search algorithm for solving infinite horizon DEC-POMDPs

ECML'05 Proceedings of the 16th European conference on Machine Learning
Perception-Action based object detection from local descriptor combination and reinforcement learning

SCIA'05 Proceedings of the 14th Scandinavian conference on Image Analysis
Multiobjective water pinch analysis of the Cuernavaca city water distribution network

EMO'05 Proceedings of the Third international conference on Evolutionary Multi-Criterion Optimization
Perceptive evaluation for the optimal discounted reward in Markov decision processes

MDAI'05 Proceedings of the Second international conference on Modeling Decisions for Artificial Intelligence
Interval-based Markov decision processes for regulating interactions between two agents in multi-agent systems

PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
Optimality of trunk reservation for an M/M/k/N queue with several customer types and holding costs

Probability in the Engineering and Informational Sciences
Bisimulations for non-deterministic labelled Markov processes

Mathematical Structures in Computer Science
The price of coordination in resource management

BPM'05 Proceedings of the 3rd international conference on Business Process Management
A policy iteration algorithm for computing fixed points in static analysis of programs

CAV'05 Proceedings of the 17th international conference on Computer Aided Verification
Proving positive almost-sure termination

RTA'05 Proceedings of the 16th international conference on Term Rewriting and Applications
Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming

Mathematics of Operations Research
A rewriting logic sampler

ICTAC'05 Proceedings of the Second international conference on Theoretical Aspects of Computing
Proving positive almost sure termination under strategies

RTA'06 Proceedings of the 17th international conference on Term Rewriting and Applications
Play to test

FATES'05 Proceedings of the 5th international conference on Formal Approaches to Software Testing
Stochastic reasoning about channel-based component connectors

COORDINATION'06 Proceedings of the 8th international conference on Coordination Models and Languages
Abstraction and generalization in reinforcement learning: a summary and framework

ALA'09 Proceedings of the Second international conference on Adaptive and Learning Agents
Probabilistic verification of uncertain systems using bounded-parameter Markov decision processes

MDAI'06 Proceedings of the Third international conference on Modeling Decisions for Artificial Intelligence
An afterstates reinforcement learning approach to optimize admission control in mobile cellular networks

EURO-NGI'05 Proceedings of the Second international conference on Wireless Systems and Network Architectures in Next Generation Internet
Model-Checking Markov chains in the presence of uncertainties

TACAS'06 Proceedings of the 12th international conference on Tools and Algorithms for the Construction and Analysis of Systems
Adaptive opportunistic routing for wireless ad hoc networks

IEEE/ACM Transactions on Networking (TON)
Admission control policies for a multi-class QoS-aware service oriented architecture

ACM SIGMETRICS Performance Evaluation Review
Basis function discovery using spectral clustering and bisimulation metrics

ALA'11 Proceedings of the 11th international conference on Adaptive and Learning Agents
Timed automata approach to verification of systems with degradation

MEMICS'11 Proceedings of the 7th international conference on Mathematical and Engineering Methods in Computer Science
Recent thermal management techniques for microprocessors

ACM Computing Surveys (CSUR)
Model-checking and simulation for stochastic timed systems

FMCO'10 Proceedings of the 9th international conference on Formal Methods for Components and Objects
Synthesizing efficient controllers

VMCAI'12 Proceedings of the 13th international conference on Verification, Model Checking, and Abstract Interpretation
Towards a bridge between cost and wealth in risk-aware planning

Applied Intelligence
Learning to make predictions in partially observable environments without a generative model

Journal of Artificial Intelligence Research
Stochastic enforced hill-climbing

Journal of Artificial Intelligence Research
Dynamic production control in parallel processing systems under process queue time constraints

Computers and Industrial Engineering
Approximate planning and verification for large Markov decision processes

Proceedings of the 27th Annual ACM Symposium on Applied Computing
Technical Communique: A unified approach to Markov decision problems and performance sensitivity analysis

Automatica (Journal of IFAC)
An empirical methodology for human integration in the SE technical processes

Systems Engineering
Eliciting additive reward functions for Markov decision processes

IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Robust online optimization of reward-uncertain MDPs

IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
NP-Hardness of checking the unichain condition in average cost MDPs

Operations Research Letters
Time aggregated Markov decision processes via standard dynamic programming

Operations Research Letters
A policy improvement method for constrained average Markov decision processes

Operations Research Letters
Characterizing extreme points as basic feasible solutions in infinite linear programs

Operations Research Letters
Markov decision processes with exponentially representable discounting

Operations Research Letters
Fast convergence to state-action frequency polytopes for MDPs

Operations Research Letters
Structured replacement policies for a Markov-modulated shock model

Operations Research Letters
On the optimal control of a two-queue polling model

Operations Research Letters
Optimal control of a production-inventory system with customer impatience

Operations Research Letters
Monotone optimal replacement policies for a Markovian deteriorating system in a controllable environment

Operations Research Letters
A note on negative dynamic programming for risk-sensitive control

Operations Research Letters
The value iteration method for countable state Markov decision processes

Operations Research Letters
Optimal scheduling in multi-server queues with random connectivity and retransmissions

Computer Communications
Human-aware planning for robots embedded in ambient ecologies

Pervasive and Mobile Computing
DoMAIns: Domain-based modeling for Ambient Intelligence

Pervasive and Mobile Computing
Cross-Layer Power Allocation for Packet Transmission Over Fading Channel

Wireless Personal Communications: An International Journal
Robustness of optimal channel reservation using handover prediction in multiservice wireless networks

Wireless Networks
Rewards for pairs of Q-learning agents conducive to turn-taking in medium-access games

Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Singularly Perturbed Discounted Markov Control Processes in a General State Space

SIAM Journal on Control and Optimization
Bisimulation Metrics for Continuous Markov Decision Processes

SIAM Journal on Computing
An Iterative Procedure for Constructing Subsolutions of Discrete-Time Optimal Control Problems

SIAM Journal on Control and Optimization
Gradient based algorithms with loss functions and kernels for improved on-policy control

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Active learning of MDP models

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
MapReduce for parallel reinforcement learning

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Dynamic potential-based reward shaping

Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Analysis of methods for solving MDPs

Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
Tax Collections Optimization for New York State

Interfaces
Power efficient scheduling over fading channel for cross-layer optimization

Wireless Communications & Mobile Computing
Finding patterns in an unknown graph

AI Communications - The Symposium on Combinatorial Search
Multi-rate control policies for elastic traffic in CDMA networks

Performance Evaluation
A rapid sparsification method for kernel machines in approximate policy iteration

ISNN'12 Proceedings of the 9th international conference on Advances in Neural Networks - Volume Part I
Solving an Infinite Horizon Adverse Selection Model Through Finite Policy Graphs

Operations Research
Optimization of China's strategic petroleum reserve policy: A Markovian decision approach

Computers and Industrial Engineering
On the Computational Complexity of Stochastic Controller Optimization in POMDPs

ACM Transactions on Computation Theory (TOCT)
Mobile cloud computing: A survey

Future Generation Computer Systems
Bisimulation and logical preservation for continuous-time markov decision processes

CONCUR'07 Proceedings of the 18th international conference on Concurrency Theory
Precise relational invariants through strategy iteration

CSL'07/EACSL'07 Proceedings of the 21st international conference, and Proceedings of the 16th annual conference on Computer Science Logic
Reachability-time games on timed automata

ICALP'07 Proceedings of the 34th international conference on Automata, Languages and Programming
GPU based generation of state transition models using simulations for unmanned surface vehicle trajectory planning

Robotics and Autonomous Systems
The Effect of Budgetary Restrictions on Breast Cancer Diagnostic Decisions

Manufacturing & Service Operations Management
Approximate verification and enumeration problems

ICTAC'12 Proceedings of the 9th international conference on Theoretical Aspects of Computing
Playing stochastic games precisely

CONCUR'12 Proceedings of the 23rd international conference on Concurrency Theory
Adaptive planning for markov decision processes with uncertain transition models via incremental feature dependency discovery

ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Learning policies for battery usage optimization in electric vehicles

ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
QAVA: quota aware video adaptation

Proceedings of the 8th international conference on Emerging networking experiments and technologies
Pareto curves for probabilistic model checking

ATVA'12 Proceedings of the 10th international conference on Automated Technology for Verification and Analysis
Modelling and decentralised runtime control of self-stabilising power micro grids

ISoLA'12 Proceedings of the 5th international conference on Leveraging Applications of Formal Methods, Verification and Validation: technologies for mastering change - Volume Part I
Optimal time-abstract schedulers for CTMDPs and continuous-time Markov games

Theoretical Computer Science
An efficient algorithm for stochastic capacity portfolio planning problems

Journal of Intelligent Manufacturing
Optimal storage policies with wind forecast uncertainties

ACM SIGMETRICS Performance Evaluation Review
Recognizing internal states of other agents to anticipate and coordinate interactions

EUMAS'11 Proceedings of the 9th European conference on Multi-Agent Systems
Periodic capacity management under a lead-time performance constraint

OR Spectrum
Comparison of ambulance diversion policies via simulation

Proceedings of the Winter Simulation Conference
Simulation to discover structure in optimal dynamic control policies

Proceedings of the Winter Simulation Conference
Optimization model selection for simulation-based approximate dynamic programming approaches in semiconductor manufacturing operations

Proceedings of the Winter Simulation Conference
Optimal batch process admission control in tandem queueing systems with queue time constraint considerations

Proceedings of the Winter Simulation Conference
Using approximate dynamic programming to optimize admission control in cloud computing environment

Proceedings of the Winter Simulation Conference
A sampled fictitious play based learning algorithm for infinite horizon Markov decision processes

Proceedings of the Winter Simulation Conference
Performance Guarantees for Empirical Markov Decision Processes with Applications to Multiperiod Inventory Models

Operations Research
Autonomic QoS Optimization of Real-Time Internet Audio Using Loss Prediction and Stochastic Control

International Journal of Adaptive, Resilient and Autonomic Systems
On-Line model-based continuous state reinforcement learning using background knowledge

AI'12 Proceedings of the 25th Australasian joint conference on Advances in Artificial Intelligence
Decision support for safe AI design

AGI'12 Proceedings of the 5th international conference on Artificial General Intelligence
A survey of point-based POMDP solvers

Autonomous Agents and Multi-Agent Systems
The duality of state and observation in probabilistic transition systems

TbiLLC'11 Proceedings of the 9th international conference on Logic, Language, and Computation
Green-Waved Cooperative Coordination Algorithm for Decentralized Traffic Control

WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02
Humans-Robots Sliding Collaboration Control in Complex Environments with Adjustable Autonomy

WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02
Performance of distributed multi-agent multi-state reinforcement spectrum management using different exploration schemes

Expert Systems with Applications: An International Journal
Parallel Abductive Query Answering in Probabilistic Logic Programs

ACM Transactions on Computational Logic (TOCL)
A comparison of decision-maker perspectives for optimal cholesterol treatment

IBM Journal of Research and Development
Shortest stochastic path with risk sensitive evaluation

MICAI'12 Proceedings of the 11th Mexican international conference on Advances in Artificial Intelligence - Volume Part I
Forward and backward feature selection in gradient-based MDP algorithms

MICAI'12 Proceedings of the 11th Mexican international conference on Advances in Artificial Intelligence - Volume Part I
Two-sided matching with partial information

Proceedings of the fourteenth ACM conference on Electronic commerce
A Dispatching Model for Server-to-Customer Systems That Balances Efficiency and Equity

Manufacturing & Service Operations Management
Model checking and performance evaluation with CADP illustrated on shared-memory mutual exclusion protocols

Science of Computer Programming
Optimal H.264 scalable video scheduling policies for 3G/4G wireless cellular and video sensor networks

Advances in Multimedia
Managing Inventory in Global Supply Chains Facing Port-of-Entry Disruption Risks

Transportation Science
Speeding-up reinforcement learning through abstraction and transfer learning

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Finding objects through stochastic shortest path problems

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Cooperating with a markovian ad hoc teammate

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
A learning agent for heat-pump thermostat control

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Modeling non-stationary opponents

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Learning in non-stationary MDPs as transfer learning

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems

Mathematics of Operations Research
Model selection in markovian processes

Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Modeling and probabilistic reasoning of population evacuation during large-scale disaster

Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Solution time reduction techniques of a stochastic dynamic programming approach for hazardous material route selection problem

Computers and Industrial Engineering
Reinforcement learning for cooperative sensing gain in cognitive radio ad hoc networks

Wireless Networks
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model

Machine Learning
Learning policies for battery usage optimization in electric vehicles

Machine Learning
Robust Modified Policy Iteration

INFORMS Journal on Computing
Approximate Linear Programming for Average Cost MDPs

Mathematics of Operations Research
Incremental clustering and expansion for faster optimal planning in decentralized POMDPs

Journal of Artificial Intelligence Research
A novel reinforcement learning architecture for continuous state and action spaces

Advances in Artificial Intelligence
Collaborative intelligence in knowledge based service planning

Expert Systems with Applications: An International Journal
The right timing: reflections on the modeling and analysis of time

PETRI NETS'13 Proceedings of the 34th international conference on Application and Theory of Petri Nets and Concurrency
Polynomial-Time verification of PCTL properties of MDPs with convex uncertainties

CAV'13 Proceedings of the 25th international conference on Computer Aided Verification
Optimal Policies for Reducing Unnecessary Follow-Up Mammography Exams in Breast Cancer Diagnosis

Decision Analysis
A novel modular Q-learning architecture to improve performance under incomplete learning in a grid soccer game

Engineering Applications of Artificial Intelligence
A dynamic programming approximation for downlink channel allocation in cognitive femtocell networks

Computer Networks: The International Journal of Computer and Telecommunications Networking
Reinforcement learning in robotics: A survey

International Journal of Robotics Research
The steady-state control problem for markov decision processes

QEST'13 Proceedings of the 10th international conference on Quantitative Evaluation of Systems
Improving Energy Efficiency in Web Services: An Agent-Based Approach for Service Selection and Dynamic Speed Scaling

International Journal of Web Services Research
Lookahead actions in dispatching to parallel queues

Performance Evaluation
A simple index rule for efficient traffic splitting over parallel wireless networks with partial information

Performance Evaluation
Curtailing Intermittent Generation in Electrical Systems

Manufacturing & Service Operations Management
Models of gaze control for manipulation tasks

ACM Transactions on Applied Perception (TAP)
Intention-aware routing to minimise delays at electric vehicle charging stations

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Efficient learning in linearly solvable MDP models

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Sufficient plan-time statistics for decentralized POMDPs

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Bayesian nonparametric feature construction for inverse reinforcement learning

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Prior-free exploration bonus for and beyond near bayes-optimal behavior

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Fault-tolerant planning under uncertainty

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Interactive value iteration for Markov decision processes with unknown rewards

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Robust optimization for hybrid MDPs with state-dependent noise

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
WrightEagle and UT Austin Villa: RoboCup 2011 simulation league champions

Robot Soccer World Cup XV
Technical communique: Policy set iteration for Markov decision processes

Automatica (Journal of IFAC)
Compositional probabilistic verification through multi-objective model checking

Information and Computation
Reinforcement learning-based design of sampling policies under cost constraints in Markov random fields: Application to weed map reconstruction

Computational Statistics & Data Analysis
Approximation Metrics Based on Probabilistic Bisimulations for General State-Space Markov Processes: A Survey

Electronic Notes in Theoretical Computer Science (ENTCS)
Scheduling sensors for monitoring sentient spaces using an approximate POMDP policy

Pervasive and Mobile Computing
Operational versus weakest pre-expectation semantics for the probabilistic guarded command language

Performance Evaluation
Multiagent learning in the presence of memory-bounded agents

Autonomous Agents and Multi-Agent Systems
A hybrid (N/M)CHO soft/hard vertical handover technique for heterogeneous wireless networks

Ad Hoc Networks
Map partitioning to approximate an exploration strategy in mobile robotics

Multiagent and Grid Systems
A survey of multi-objective sequential decision-making

Journal of Artificial Intelligence Research
Point-based online value iteration algorithm in large POMDP

Applied Intelligence

Quantified Score

Hi-index	0.15

Abstract

From the Publisher: The past decade has seen considerable theoretical and applied research on Markov decision processes, as well as the growing use of these models in ecology, economics, communications engineering, and other fields where outcomes are uncertain and sequential decision-making processes are needed. A timely response to this increased activity, Martin L. Puterman's new work provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models. It discusses all major research directions in the field, highlights many significant applications of Markov decision process models, and explores numerous important topics that have previously been neglected or given cursory coverage in the literature. Markov Decision Processes focuses primarily on infinite horizon discrete time models and models with discrete state spaces while also examining models with arbitrary state spaces, finite horizon models, and continuous-time discrete state models. The book is organized around optimality criteria, using a common framework centered on the optimality (Bellman) equation for presenting results. The results are presented in a "theorem-proof" format and elaborated on through both discussion and examples, including results that are not available in any other book. A two-state Markov decision process model, presented in Chapter 3, is analyzed repeatedly throughout the book and demonstrates many results and algorithms. Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk-sensitive optimality criteria. It also explores several topics that have received little or no attention in other books, including modified policy iteration, multichain models with average reward criterion, and sensitive optimality.
In addition, a Bibliographic Remarks section in each chapter comments on relevant historic