Mean-field theory for batched TD(λ)
Neural Computation
Learning agents for uncertain environments (extended abstract)
COLT '98 Proceedings of the eleventh annual conference on Computational learning theory
Solving very large weakly coupled Markov decision processes
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Incremental Gradient Algorithms with Stepsizes Bounded Away from Zero
Computational Optimization and Applications
Machine Learning
Elevator Group Control Using Multiple Reinforcement Learning Agents
Machine Learning
Learning Team Strategies: Soccer Case Studies
Machine Learning
Reinforcement learning and mistake bounded algorithms
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Convergence analysis of temporal-difference learning algorithms with linear function approximation
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Toward a Model of Intelligence as an Economy of Agents
Machine Learning
Congestion-dependent pricing of network services
IEEE/ACM Transactions on Networking (TON)
Learning to Play Chess Using Temporal Differences
Machine Learning
Improved results for route planning in stochastic transportation
SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
Computational challenges in portfolio management
Computing in Science and Engineering
Proceedings of the 33rd conference on Winter simulation
Proceedings of the 33rd conference on Winter simulation
Proceedings of the twenty-first annual symposium on Principles of distributed computing
Optimal resource allocation in multi-class networks with user-specified utility functions
Computer Networks: The International Journal of Computer and Telecommunications Networking
A Hybrid Architecture for Situated Learning of Reactive Sequential Decision Making
Applied Intelligence
Planning and Control in Artificial Intelligence: A Unifying Perspective
Applied Intelligence
A Model of Partially Observable State Game and its Optimality
Applied Intelligence
Reinforcement Learning Soccer Teams with Incomplete World Models
Autonomous Robots
The Relations Among Potentials, Perturbation Analysis, and Markov Decision Processes
Discrete Event Dynamic Systems
Dynamics of Transmission Provision in a Competitive Power Industry
Discrete Event Dynamic Systems
Rollout Algorithms for Combinatorial Optimization
Journal of Heuristics
Rollout Algorithms for Stochastic Scheduling Problems
Journal of Heuristics
On the value function of a priority queue with an application to a controlled polling model
Queueing Systems: Theory and Applications
Machine Learning
Near-Optimal Reinforcement Learning in Polynomial Time
Machine Learning
Technical Update: Least-Squares Temporal Difference Learning
Machine Learning
Risk-Sensitive Reinforcement Learning
Machine Learning
Variable Resolution Discretization in Optimal Control
Machine Learning
Linear waste of best fit bin packing on skewed distributions
Random Structures & Algorithms - Probabilistic methods in combinatorial optimization
Dopamine: generalization and bonuses
Neural Networks - Computational models of neuromodulation
Opponent interactions between serotonin and dopamine
Neural Networks - Computational models of neuromodulation
From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning
Discrete Event Dynamic Systems
Recent Advances in Hierarchical Reinforcement Learning
Discrete Event Dynamic Systems
Least Squares Policy Evaluation Algorithms with Linear Function Approximation
Discrete Event Dynamic Systems
A Topic-Specific Web Robot Model Based on Restless Bandits
IEEE Internet Computing
Sequence Learning: From Recognition and Prediction to Sequential Decision Making
IEEE Intelligent Systems
Optimal control using the transport equation: the Liouville machine
Adaptive Behavior
Learning of plan execution policies for indoor navigation
AI Communications - Special issue on KI-2001
Optimizing hypervideo navigation using a Markov decision process approach
Proceedings of the tenth ACM international conference on Multimedia
Machines that learn to play games
ECML '01 Proceedings of the 12th European Conference on Machine Learning
Learning While Exploring: Bridging the Gaps in the Eligibility Traces
ECML '01 Proceedings of the 12th European Conference on Machine Learning
Propagation of Q-values in Tabular TD(lambda)
ECML '02 Proceedings of the 13th European Conference on Machine Learning
Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning
ECML '02 Proceedings of the 13th European Conference on Machine Learning
Characterizing Markov Decision Processes
ECML '02 Proceedings of the 13th European Conference on Machine Learning
ICCS '01 Proceedings of the International Conference on Computational Science-Part II
A Framework for Supporting Intelligent Fault and Performance Management for Communication Networks
MMNS '01 Proceedings of the 4th IFIP/IEEE International Conference on Management of Multimedia Networks and Services: Management of Multimedia on the Internet
Minimizing Transmission Costs through Adaptive Marking in Differentiated Services Networks
MMNS '02 Proceedings of the 5th IFIP/IEEE International Conference on Management of Multimedia Networks and Services: Management of Multimedia on the Internet
An Overview of MAXQ Hierarchical Reinforcement Learning
SARA '02 Proceedings of the 4th International Symposium on Abstraction, Reformulation, and Approximation
Reinforcement Learning: Past, Present and Future
SEAL '98 Selected papers from the Second Asia-Pacific Conference on Simulated Evolution and Learning
Least-Squares Methods in Reinforcement Learning for Control
SETN '02 Proceedings of the Second Hellenic Conference on AI: Methods and Applications of Artificial Intelligence
Modelling Intelligent Behaviour: The Markov Decision Process Approach
IBERAMIA '98 Proceedings of the 6th Ibero-American Conference on AI: Progress in Artificial Intelligence
An Analysis of the Pheromone Q-Learning Algorithm
IBERAMIA 2002 Proceedings of the 8th Ibero-American Conference on AI: Advances in Artificial Intelligence
Hybrid Framework for Neuro-Dynamic Programming Application to Water Supply Networks
IWANN '01 Proceedings of the 6th International Work-Conference on Artificial and Natural Neural Networks: Bio-inspired Applications of Connectionism-Part II
Rationality Assumptions and Optimality of Co-learning
PRIMA '00 Proceedings of the Third Pacific Rim International Workshop on Multi-Agents: Design and Applications of Intelligent Agents
Karlsruhe Brainstormers - A Reinforcement Learning Approach to Robotic Soccer
RoboCup 2001: Robot Soccer World Cup V
Different Local Search Algorithms in STAGE for Solving Bin Packing Problem
EurAsia-ICT '02 Proceedings of the First EurAsian Conference on Information and Communication Technology
Learning to Behave by Environment Reinforcement
RoboCup-99: Robot Soccer World Cup III
Computing Minimum and Maximum Reachability Times in Probabilistic Systems
CONCUR '99 Proceedings of the 10th International Conference on Concurrency Theory
An Improved Q-Learning Algorithm Using Synthetic Pheromones
CEEMAS '01 Revised Papers from the Second International Workshop of Central and Eastern Europe on Multi-Agent Systems: From Theory to Practice in Multi-Agent Systems
Distributed Learning and Control for Manufacturing Systems Scheduling
Proceedings of the 14th International conference on Industrial and engineering applications of artificial intelligence and expert systems: engineering of intelligent systems
On the Asymptotic Behaviour of a Constant Stepsize Temporal-Difference Learning Algorithm
EuroCOLT '99 Proceedings of the 4th European Conference on Computational Learning Theory
Open Theoretical Questions in Reinforcement Learning
EuroCOLT '99 Proceedings of the 4th European Conference on Computational Learning Theory
Introduction to Sequence Learning
Sequence Learning - Paradigms, Algorithms, and Applications
Automatic Segmentation of Sequences through Hierarchical Reinforcement Learning
Sequence Learning - Paradigms, Algorithms, and Applications
Sequential Decision Making Based on Direct Search
Sequence Learning - Paradigms, Algorithms, and Applications
Towards Stochastic Constraint Programming: A Study of Online Multi-choice Knapsack with Deadlines
CP '01 Proceedings of the 7th International Conference on Principles and Practice of Constraint Programming
Restart Policies with Dependence among Runs: A Dynamic Programming Approach
CP '02 Proceedings of the 8th International Conference on Principles and Practice of Constraint Programming
Logic, Knowledge Representation, and Bayesian Decision Theory
CL '00 Proceedings of the First International Conference on Computational Logic
Dynamic Pricing of Information Products Based on Reinforcement Learning: A Yield-Management Approach
KI '02 Proceedings of the 25th Annual German Conference on AI: Advances in Artificial Intelligence
Feedforward Neural Networks in Reinforcement Learning Applied to High-Dimensional Motor Control
ALT '02 Proceedings of the 13th International Conference on Algorithmic Learning Theory
To Collect or Not to Collect? Machine Learning for Memory Management
Proceedings of the 2nd Java Virtual Machine Research and Technology Symposium
COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
PAC Bounds for Multi-armed Bandit and Markov Decision Processes
COLT '02 Proceedings of the 15th Annual Conference on Computational Learning Theory
Using Rollouts to Induce a Policy from a User Model
UM '01 Proceedings of the 8th International Conference on User Modeling 2001
Adaptive Strategies and Regret Minimization in Arbitrarily Varying Markov Environments
COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Memetic-neural scheduler of jobs in identical parallel machines
Second international workshop on Intelligent systems design and application
Sequential cost-sensitive decision making with reinforcement learning
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Polynomial-time reinforcement learning of near-optimal policies
Eighteenth national conference on Artificial intelligence
Greedy linear value-approximation for factored Markov decision processes
Eighteenth national conference on Artificial intelligence
Piecewise linear value function approximation for factored MDPs
Eighteenth national conference on Artificial intelligence
Multi-agent learning in extensive games with complete information
AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
An introduction to reinforcement learning theory: value function methods
Advanced lectures on machine learning
Solving factored MDPs using non-homogeneous partitions
Artificial Intelligence - special issue on planning with uncertainty and incomplete information
The empirical Bayes envelope and regret minimization in competitive Markov decision processes
Mathematics of Operations Research
On the convergence of optimistic policy iteration
The Journal of Machine Learning Research
Lyapunov design for safe reinforcement learning
The Journal of Machine Learning Research
Adaptive Radial Basis Decomposition by Learning Vector Quantization
Neural Processing Letters
On finding global optima for the hinge fitting problem
Computers and Operations Research
Computer Networks: The International Journal of Computer and Telecommunications Networking
Least-squares policy iteration
The Journal of Machine Learning Research
Combining importance sampling and temporal difference control variates to simulate Markov Chains
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Convergence of Simulation-Based Policy Iteration
Probability in the Engineering and Informational Sciences
A Learning Algorithm for Discrete-Time Stochastic Control
Probability in the Engineering and Informational Sciences
Parallel Rollout for Online Solution of Partially Observable Markov Decision Processes
Discrete Event Dynamic Systems
Online pricing for bandwidth provisioning in multi-class networks
Computer Networks: The International Journal of Computer and Telecommunications Networking
The Journal of Machine Learning Research
A Geometric Approach to Multi-Criterion Reinforcement Learning
The Journal of Machine Learning Research
A generic architecture for adaptive agents based on reinforcement learning
Information Sciences—Informatics and Computer Science: An International Journal - Special issue: Bio-inspired systems (BIS)
An Explicit Solution for the Value Function of a Priority Queue
Queueing Systems: Theory and Applications
Dynamic abstraction in reinforcement learning via clustering
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Convergence of synchronous reinforcement learning with linear function approximation
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Modeling correlations in web traces and implications for designing replacement policies
Computer Networks: The International Journal of Computer and Telecommunications Networking
Reinforcement Learning with Factored States and Actions
The Journal of Machine Learning Research
Integrating Guidance into Relational Reinforcement Learning
Machine Learning
A Dynamic Pricing Mechanism for P2P Referral Systems
AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 3
Planning and programming with first-order Markov decision processes: insights and challenges
TARK '01 Proceedings of the 8th conference on Theoretical aspects of rationality and knowledge
New simulation methodology for finance: duality theory and simulation in financial engineering
Proceedings of the 35th conference on Winter simulation: driving innovation
Routing Of Airplanes To Two Runways: Monotonicity Of Optimal Controls
Probability in the Engineering and Informational Sciences
BROADNETS '04 Proceedings of the First International Conference on Broadband Networks
A Stochastic Control Model for Deployment of Dynamic Grid Services
GRID '04 Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing
Solving factored MDPs with continuous and discrete variables
UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
Coordinating Multiple Agents via Reinforcement Learning
Autonomous Agents and Multi-Agent Systems
System for foreign exchange trading using genetic algorithms and reinforcement learning
International Journal of Systems Science
Online convex optimization in the bandit setting: gradient descent without a gradient
SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
Parallelization Strategies for Rollout Algorithms
Computational Optimization and Applications
Encyclopedia of Computer Science
Computational intelligence for structured learning of a partner robot based on imitation
Information Sciences—Informatics and Computer Science: An International Journal - Special issue: Intelligent embedded agents
Dynamical Mobile Terminal Location Registration in Wireless PCS Networks
IEEE Transactions on Mobile Computing
Optimal Control Using the Transport Equation: The Liouville Machine
Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Exploration and apprenticeship learning in reinforcement learning
ICML '05 Proceedings of the 22nd international conference on Machine learning
Reinforcement learning with Gaussian processes
ICML '05 Proceedings of the 22nd international conference on Machine learning
Interactive learning of mappings from visual percepts to actions
ICML '05 Proceedings of the 22nd international conference on Machine learning
Relating reinforcement learning performance to classification performance
ICML '05 Proceedings of the 22nd international conference on Machine learning
Proto-value functions: developmental reinforcement learning
ICML '05 Proceedings of the 22nd international conference on Machine learning
Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees
ICML '05 Proceedings of the 22nd international conference on Machine learning
Finite time bounds for sampling based fitted value iteration
ICML '05 Proceedings of the 22nd international conference on Machine learning
Bayesian sparse sampling for on-line reward optimization
ICML '05 Proceedings of the 22nd international conference on Machine learning
A Reinforcement Learning Algorithm in Cooperative Multi-Robot Domains
Journal of Intelligent and Robotic Systems
An analytic modelling approach for network routing algorithms that use "ant-like" mobile agents
Computer Networks: The International Journal of Computer and Telecommunications Networking
Minds and Machines - Machine learning as experimental philosophy of science
A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms
Neural Computation
A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning
Discrete Event Dynamic Systems
Using inaccurate models in reinforcement learning
ICML '06 Proceedings of the 23rd international conference on Machine learning
Approximate Policy Optimization and Adaptive Control in Regression Models
Computational Economics
Mobile Networks and Applications - Special issue: Recent advances in wireless networking
Function-approximation-based importance sampling for pricing American options
WSC '04 Proceedings of the 36th conference on Winter simulation
The optimizing-simulator: merging simulation and optimization using approximate dynamic programming
WSC '05 Proceedings of the 37th conference on Winter simulation
Approximate dynamic programming in multi-skill call centers
WSC '05 Proceedings of the 37th conference on Winter simulation
Function-approximation-based perfect control variates for pricing American options
WSC '05 Proceedings of the 37th conference on Winter simulation
Neural Processing Letters
The Role of Problem Classification in Online Meta-cognition
IAT '06 Proceedings of the IEEE/WIC/ACM international conference on Intelligent Agent Technology
Neural Networks - 2006 Special issue: Neurobiology of decision making
Learning Tetris using the noisy cross-entropy method
Neural Computation
Neural Network Feedback Control: Work at UTA's Automation and Robotics Research Institute
Journal of Intelligent and Robotic Systems
Dimensions of complexity of intelligent agents
PCAR '06 Proceedings of the 2006 international symposium on Practical cognitive agents and robots
Asymptotic Variance Of Passage Time Estimators In Markov Chains
Probability in the Engineering and Informational Sciences
Mathematics of Operations Research
Approximate Solutions of a Dynamic Forecast-Inventory Model
Manufacturing & Service Operations Management
Models of the Spiral-Down Effect in Revenue Management
Operations Research
Bias and Variance Approximation in Value Function Estimates
Management Science
A robust Markov game controller for nonlinear systems
Applied Soft Computing
On the Empirical State-Action Frequencies in Markov Decision Processes Under General Policies
Mathematics of Operations Research
Retailer-Supplier Flexible Commitments Contracts: A Robust Optimization Approach
Manufacturing & Service Operations Management
A Price-Directed Approach to Stochastic Inventory/Routing
Operations Research
Robust Control of Markov Decision Processes with Uncertain Transition Matrices
Operations Research
The Dynamic Assignment Problem
Transportation Science
The Journal of Machine Learning Research
Collaborative Multiagent Reinforcement Learning by Payoff Propagation
The Journal of Machine Learning Research
STEWARD: demo of spatio-textual extraction on the web aiding the retrieval of documents
dg.o '07 Proceedings of the 8th annual international conference on Digital government research: bridging disciplines & domains
Point-Based Value Iteration for Continuous POMDPs
The Journal of Machine Learning Research
Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule
Neural Computation
Bayesian actor-critic algorithms
Proceedings of the 24th international conference on Machine learning
Automatic shaping and decomposition of reward functions
Proceedings of the 24th international conference on Machine learning
Learning to trade with insider information
Proceedings of the ninth international conference on Electronic commerce
A framework for meta-level control in multi-agent systems
Autonomous Agents and Multi-Agent Systems
Shaping multi-agent systems with gradient reinforcement learning
Autonomous Agents and Multi-Agent Systems
Efficient PAC Learning for Episodic Tasks with Acyclic State Spaces
Discrete Event Dynamic Systems
Design of a peer-to-peer system for optimized content replication
Computer Communications
Optimal Sequential Exploration: A Binary Learning Model
Decision Analysis
Convergence Analysis of Batch Gradient Algorithm for Three Classes of Sigma-Pi Neural Networks
Neural Processing Letters
Application of reinforcement learning to the game of Othello
Computers and Operations Research
IEEE Transactions on Parallel and Distributed Systems
EURASIP Journal on Wireless Communications and Networking
Efficient sampling in approximate dynamic programming algorithms
Computational Optimization and Applications
Universal Intelligence: A Definition of Machine Intelligence
Minds and Machines
Probabilistic incremental program evolution
Evolutionary Computation
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning
Artificial Intelligence
The optimizing-simulator: merging simulation and optimization using approximate dynamic programming
Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come
Neurocomputing
RL-MAC: a reinforcement learning based MAC protocol for wireless sensor networks
International Journal of Sensor Networks
Dynamic modeling and control of supply chain systems: A review
Computers and Operations Research
Biologically-inspired adaptive learning control strategies: A rough set approach
International Journal of Hybrid Intelligent Systems
Approximate dynamic programming for link scheduling in wireless mesh networks
Computers and Operations Research
Knowledge propagation in a distributed omnidirectional vision system
Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology - Marco Somalvico Memorial Issue
Error bounds of optimization algorithms for semi-Markov decision processes
International Journal of Systems Science
Self-Optimizing Memory Controllers: A Reinforcement Learning Approach
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Reinforcement learning in the presence of rare events
Proceedings of the 25th international conference on Machine learning
Non-parametric policy gradients: a unified treatment of propositional and relational domains
Proceedings of the 25th international conference on Machine learning
Proceedings of the 25th international conference on Machine learning
An analysis of reinforcement learning with function approximation
Proceedings of the 25th international conference on Machine learning
A semiparametric statistical approach to model-free policy evaluation
Proceedings of the 25th international conference on Machine learning
Learning Control Knowledge for Forward Search Planning
The Journal of Machine Learning Research
Finite-Time Bounds for Fitted Value Iteration
The Journal of Machine Learning Research
Mapping land cover from detailed aerial photography data using textural and neural network analysis
International Journal of Remote Sensing
Controlling deliberation in a Markov decision process-based agent
Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Interaction-driven Markov games for decentralized multiagent planning under uncertainty
Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Reinforcement Learning in Nonstationary Environment Navigation Tasks
CAI '07 Proceedings of the 20th conference of the Canadian Society for Computational Studies of Intelligence on Advances in Artificial Intelligence
Making a Robot Learn to Play Soccer Using Reward and Punishment
KI '07 Proceedings of the 30th annual German conference on Advances in Artificial Intelligence
Simulation-Based Optimization Approach for Software Cost Model with Rejuvenation
ATC '08 Proceedings of the 5th international conference on Autonomic and Trusted Computing
A Learning Automata Approach to Multi-agent Policy Gradient Learning
KES '08 Proceedings of the 12th international conference on Knowledge-Based Intelligent Information and Engineering Systems, Part II
Evolution Strategies for Direct Policy Search
Proceedings of the 10th international conference on Parallel Problem Solving from Nature: PPSN X
Fitted Natural Actor-Critic: A New Algorithm for Continuous State-Action MDPs
ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Dynamic and Neuro-Dynamic Optimization of a Fed-Batch Fermentation Process
AIMSA '08 Proceedings of the 13th international conference on Artificial Intelligence: Methodology, Systems, and Applications
ISNN '08 Proceedings of the 5th international symposium on Neural Networks: Advances in Neural Networks, Part II
Value Function Based Reinforcement Learning in Changing Markovian Environments
The Journal of Machine Learning Research
Learning the Filling Policy of a Biodegradation Process by Fuzzy Actor-Critic Learning Methodology
MICAI '08 Proceedings of the 7th Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence
Tank Model Coupled with an Artificial Neural Network
MICAI '08 Proceedings of the 7th Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence
Optimistic Planning of Deterministic Systems
Recent Advances in Reinforcement Learning
Policy Iteration for Learning an Exercise Policy for American Options
Recent Advances in Reinforcement Learning
New Error Bounds for Approximations from Projected Linear Equations
Recent Advances in Reinforcement Learning
Markov Decision Processes with Arbitrary Reward Processes
Recent Advances in Reinforcement Learning
A New Learning Algorithm for Optimal Stopping
Discrete Event Dynamic Systems
Optimal parameter trajectory estimation in parameterized SDEs: An algorithmic procedure
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Learning While Optimizing an Unknown Fitness Surface
Learning and Intelligent Optimization
A spiking neural network model of an actor-critic learning agent
Neural Computation
The factored policy-gradient planner
Artificial Intelligence
Practical solution techniques for first-order MDPs
Artificial Intelligence
Factored value iteration converges
Acta Cybernetica
Factored temporal difference learning in the new ties environment
Acta Cybernetica
Some topics for simulation optimization
Proceedings of the 40th Conference on Winter Simulation
Approximate dynamic programming: lessons from the field
Proceedings of the 40th Conference on Winter Simulation
Projected equation methods for approximate solution of large linear systems
Journal of Computational and Applied Mathematics
Gaussian process dynamic programming
Neurocomputing
Reinforcement distribution in fuzzy Q-learning
Fuzzy Sets and Systems
Dynamic routing policies for multiskill call centers
Probability in the Engineering and Informational Sciences
Opportunistic Transmission over Randomly Varying Channels
Network Control and Optimization
Interfaces
An Optimal Approximate Dynamic Programming Algorithm for the Lagged Asset Acquisition Problem
Mathematics of Operations Research
Simultaneous Optimal Control and Discrete Stochastic Sensor Selection
HSCC '09 Proceedings of the 12th International Conference on Hybrid Systems: Computation and Control
Reinforcement Learning: A Tutorial Survey and Recent Advances
INFORMS Journal on Computing
The optimizing-simulator: An illustration using the military airlift problem
ACM Transactions on Modeling and Computer Simulation (TOMACS)
A reinforcement learning framework for utility-based scheduling in resource-constrained systems
Future Generation Computer Systems
Constraint relaxation in approximate linear programs
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Model-free reinforcement learning as mixture learning
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Performance Evaluation of Direct Heuristic Dynamic Programming using Control-Theoretic Measures
Journal of Intelligent and Robotic Systems
A POMDP framework for coordinated guidance of autonomous UAVs for multitarget tracking
EURASIP Journal on Advances in Signal Processing - Special issue on signal processing advances in robots and autonomy
Online exploration in least-squares policy iteration
Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Learning of coordination: exploiting sparse interactions in multiagent systems
Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Scheduling policy design for autonomic systems
International Journal of Autonomous and Adaptive Communications Systems
Reordering Sparsification of Kernel Machines in Approximate Policy Iteration
ISNN 2009 Proceedings of the 6th International Symposium on Neural Networks: Advances in Neural Networks - Part II
Reinforcement Learning Control of a Real Mobile Robot Using Approximate Policy Iteration
ISNN 2009 Proceedings of the 6th International Symposium on Neural Networks: Advances in Neural Networks - Part III
Direct Policy Search Reinforcement Learning for Robot Control
Proceedings of the 2005 conference on Artificial Intelligence Research and Development
An Agent-Based Model for the Adaptation of Processing Efficiency for Prioritized Traffic
KES-AMSTA '09 Proceedings of the Third KES International Symposium on Agent and Multi-Agent Systems: Technologies and Applications
Reinforcement learning for robot soccer
Autonomous Robots
Partially Observable Markov Decision Process Approximations for Adaptive Sensing
Discrete Event Dynamic Systems
Evolutionary-based learning of generalised policies for AI planning domains
Proceedings of the 11th Annual conference on Genetic and evolutionary computation
Apply ant colony optimization to Tetris
Proceedings of the 11th Annual conference on Genetic and evolutionary computation
Multiscale Anticipatory Behavior by Hierarchical Reinforcement Learning
Anticipatory Behavior in Adaptive Learning Systems
Learning Representation and Control in Markov Decision Processes: New Frontiers
Foundations and Trends® in Machine Learning
Randomized shortest-path problems: Two related models
Neural Computation
Reinforcement learning for a CPG-driven biped robot
AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
Learning representation and control in continuous Markov decision processes
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 2
Focused real-time dynamic programming for MDPs: squeezing more out of a heuristic
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 2
Interactively shaping agents via human reinforcement: the TAMER framework
Proceedings of the fifth international conference on Knowledge capture
Online Markov Decision Processes
Mathematics of Operations Research
Markov Decision Processes with Arbitrary Reward Processes
Mathematics of Operations Research
Optimal Online Learning Procedures for Model-Free Policy Evaluation
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II
Samuel meets Amarel: automating value function approximation using global state space analysis
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Error bounds for approximate value iteration
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Lazy approximation for solving continuous finite-horizon MDPs
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 3
Compact spectral bases for value function approximation using Kronecker factorization
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Hierarchical reinforcement learning with the MAXQ value function decomposition
Journal of Artificial Intelligence Research
Optimizing dialogue management with reinforcement learning: experiments with the NJFun system
Journal of Artificial Intelligence Research
Efficient reinforcement learning using recursive least-squares methods
Journal of Artificial Intelligence Research
Potential-based shaping and Q-value initialization are equivalent
Journal of Artificial Intelligence Research
Efficient solution algorithms for factored MDPs
Journal of Artificial Intelligence Research
Accelerating reinforcement learning through implicit imitation
Journal of Artificial Intelligence Research
Risk-sensitive reinforcement learning applied to control under constraints
Journal of Artificial Intelligence Research
Perseus: randomized point-based value iteration for POMDPs
Journal of Artificial Intelligence Research
Journal of Artificial Intelligence Research
Solving factored MDPs with hybrid state and action variables
Journal of Artificial Intelligence Research
Anytime point-based approximations for large POMDPs
Journal of Artificial Intelligence Research
Closed-loop learning of visual control policies
Journal of Artificial Intelligence Research
Learning to play using low-complexity rule-based policies: illustrations through Ms. Pac-Man
Journal of Artificial Intelligence Research
Adaptive stochastic resource control: a machine learning approach
Journal of Artificial Intelligence Research
Learning partially observable deterministic action models
Journal of Artificial Intelligence Research
A heuristic search approach to planning with continuous resources in stochastic domains
Journal of Artificial Intelligence Research
AntNet: distributed stigmergetic control for communications networks
Journal of Artificial Intelligence Research
Infinite-horizon policy-gradient estimation
Journal of Artificial Intelligence Research
Experiments with infinite-horizon, policy-gradient estimation
Journal of Artificial Intelligence Research
Sequential optimality and coordination in multiagent systems
IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 1
Convergence of reinforcement learning with general function approximators
IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
A decision-theoretic model of assistance
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Using learned policies in heuristic-search planning
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Hierarchical heuristic forward search in Stochastic domains
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
An analysis of Laplacian methods for value function approximation in MDPs
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Approximate policy iteration using large-margin classifiers
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
A bilinear programming approach for multiagent planning
Journal of Artificial Intelligence Research
Solving factored MDPs via non-homogeneous partitioning
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
Symbolic dynamic programming for first-order MDPs
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
From Q(λ) to average Q-learning: efficient implementation of an asymptotic approximation
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Exploiting multiple secondary reinforcers in policy gradient reinforcement learning
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Probabilistic reasoning for plan robustness
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Solving POMDPs with continuous or large discrete observation spaces
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
An MCMC approach to solving hybrid factored MDPs
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
On solving the Lagrangian dual of integer programs via an incremental approach
Computational Optimization and Applications
Intensional dynamic programming. A Rosetta stone for structured dynamic programming
Journal of Algorithms
Assured end-to-end QoS through adaptive marking in multi-domain differentiated services networks
Computer Communications
An analytic modelling approach for network routing algorithms that use "ant-like" mobile agents
Computer Networks: The International Journal of Computer and Telecommunications Networking
Reinforcement learning versus model predictive control: a comparison on a power system problem
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Neural network output optimization using interval analysis
IEEE Transactions on Neural Networks
Boundedness and convergence of online gradient method with penalty for feedforward neural networks
IEEE Transactions on Neural Networks
IEEE Transactions on Neural Networks
A Q-learning approach to derive optimal consumption and investment strategies
IEEE Transactions on Neural Networks
A Computational Model of Social-Learning Mechanisms
Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Solving POMDPs: RTDP-bel vs. point-based algorithms
IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
ReTrASE: integrating paradigms for approximate probabilistic planning
IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Training parsers by inverse reinforcement learning
Machine Learning
Reinforcement learning and adaptive dynamic programming for feedback control
IEEE Circuits and Systems Magazine
Customized learning algorithms for episodic tasks with acyclic state spaces
CASE'09 Proceedings of the fifth annual IEEE international conference on Automation science and engineering
Stochastic model for outcome prediction in acute illness
Computers in Biology and Medicine
Online learning in Markov decision processes with arbitrarily changing rewards and transitions
GameNets'09 Proceedings of the First ICST international conference on Game Theory for Networks
An Additive Reinforcement Learning
ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part I
ADT '09 Proceedings of the 1st International Conference on Algorithmic Decision Theory
Reinforcement Learning Based Web Service Compositions for Mobile Business
WISM '09 Proceedings of the International Conference on Web Information Systems and Mining
A reinforcement learning framework for utility-based scheduling in resource-constrained systems
A reinforcement learning approach to dynamic resource allocation
Constrained controller design for a class of nonlinear discrete-time uncertain systems
ACC'09 Proceedings of the 2009 conference on American Control Conference
Approximate dynamic programming using Bellman residual elimination and Gaussian process regression
ACC'09 Proceedings of the 2009 conference on American Control Conference
Efficient suboptimal solutions of switched LQR problems
ACC'09 Proceedings of the 2009 conference on American Control Conference
icLQG: combining local and global optimization for control in information space
ICRA'09 Proceedings of the 2009 IEEE international conference on Robotics and Automation
Online actor critic algorithm to solve the continuous-time infinite horizon optimal control problem
IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Generalized policy iteration for continuous-time systems
IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
An alpha derivative formulation of the Hamilton-Jacobi-Bellman equation of dynamic programming
IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Computational intelligence for structured learning of a partner robot based on imitation
Information Sciences: an International Journal
On the evolution of artificial Tetris players
CIG'09 Proceedings of the 5th international conference on Computational Intelligence and Games
R-POPTVR: a novel reinforcement-based POPTVR fuzzy neural network for pattern classification
IEEE Transactions on Neural Networks
Adaptive dynamic programming: an introduction
IEEE Computational Intelligence Magazine
Percentile Optimization for Markov Decision Processes with Parameter Uncertainty
Operations Research
Restless watchdog: selective quickest spectrum sensing in multichannel cognitive radio systems
EURASIP Journal on Advances in Signal Processing - Special issue on dynamic spectrum access for wireless networking
The improvement of Q-learning applied to imperfect information game
SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
Rejoinder---The Languages of Stochastic Optimization
INFORMS Journal on Computing
Probability in the Engineering and Informational Sciences
A framework for the design of a military operational supply network
CISDA'09 Proceedings of the Second IEEE international conference on Computational intelligence for security and defense applications
QoS differentiated and fair packet scheduling in broadband wireless access networks
EURASIP Journal on Wireless Communications and Networking - Special issue on broadband wireless access
Online adaptive policies for ensemble classifiers
Neurocomputing
Fuzzy decision tree function approximation in reinforcement learning
International Journal of Artificial Intelligence and Soft Computing
RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments
The Journal of Machine Learning Research
On the connections between PCTL and dynamic programming
Proceedings of the 13th ACM international conference on Hybrid systems: computation and control
A Convergent Online Single Time Scale Actor Critic Algorithm
The Journal of Machine Learning Research
Reinforcement learning as a means of dynamic aggregate QoS provisioning
Art-QoS'03 Proceedings of the 2003 international conference on Architectures for quality of service in the internet
Computational approaches to reachability analysis of stochastic hybrid systems
HSCC'07 Proceedings of the 10th international conference on Hybrid systems: computation and control
Parallelizing parallel rollout algorithm for solving Markov decision processes
WOMPAT'03 Proceedings of the OpenMP applications and tools 2003 international conference on OpenMP shared memory parallel programming
A POMDP approximation algorithm that anticipates the need to observe
PRICAI'00 Proceedings of the 6th Pacific Rim international conference on Artificial intelligence
Near sets: toward approximation space-based object recognition
RSKT'07 Proceedings of the 2nd international conference on Rough sets and knowledge technology
Learning autonomous behaviours for non-holonomic vehicles
IWANN'07 Proceedings of the 9th international work conference on Artificial neural networks
Q-learning with linear function approximation
COLT'07 Proceedings of the 20th annual conference on Learning theory
Learning models of relational MDPs using graph kernels
MICAI'07 Proceedings of the 6th Mexican international conference on Advances in artificial intelligence
Solutions to real-world instances of PSPACE-complete stacking
ESA'07 Proceedings of the 15th annual European conference on Algorithms
Validation of a reinforcement learning policy for dosage optimization of erythropoietin
AI'07 Proceedings of the 20th Australian joint conference on Advances in artificial intelligence
Rollout strategy-based probabilistic causal model approach for the multiple fault diagnosis
Robotics and Computer-Integrated Manufacturing
ICIC'09 Proceedings of the 5th international conference on Emerging intelligent computing technology and applications
Approximate Dynamic Programming for Ambulance Redeployment
INFORMS Journal on Computing
Feature discovery in reinforcement learning using genetic programming
EuroGP'08 Proceedings of the 11th European conference on Genetic programming
A relational hierarchical model for decision-theoretic assistance
ILP'07 Proceedings of the 17th international conference on Inductive logic programming
Relational reinforcement learning for agents in worlds with objects
Adaptive agents and multi-agent systems
Reward-modulated Hebbian learning of decision making
Neural Computation
Improving optimistic exploration in model-free reinforcement learning
ICANNGA'09 Proceedings of the 9th international conference on Adaptive and natural computing algorithms
Bounds for multistage stochastic programs using supervised learning strategies
SAGA'09 Proceedings of the 5th international conference on Stochastic algorithms: foundations and applications
IEEE Transactions on Evolutionary Computation
Joint connection admission control and routing in IEEE 802.16-based mesh networks
IEEE Transactions on Wireless Communications
A general framework to detect unsafe system states from multisensor data stream
IEEE Transactions on Intelligent Transportation Systems
Error Bounds for Approximations from Projected Linear Equations
Mathematics of Operations Research
Linearly Parameterized Bandits
Mathematics of Operations Research
PAC-MDP learning with knowledge-based admissible models
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1
Cultivating desired behaviour: policy teaching via environment-dynamics tweaks
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1
Training a Tetris agent via interactive shaping: a demonstration of the TAMER framework
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1
Information Relaxations and Duality in Stochastic Dynamic Programs
Operations Research
Coordinated learning in multiagent MDPs with infinite state-space
Autonomous Agents and Multi-Agent Systems
Opportunistic Fair Scheduling in Wireless Networks: An Approximate Dynamic Programming Approach
Mobile Networks and Applications
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Computational Optimization and Applications
IEEE Transactions on Neural Networks
Functional Optimization Through Semilocal Approximate Minimization
Operations Research
Steady-state genetic algorithms for growing topological mapping and localization
PRICAI'10 Proceedings of the 11th Pacific Rim international conference on Trends in artificial intelligence
Adaptive bases for reinforcement learning
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I
Adaptive critic design with ESN critic for bioprocess optimization
ICANN'10 Proceedings of the 20th international conference on Artificial neural networks: Part II
On the potential of process simulation in software project schedule optimization
COMPSAC-W'05 Proceedings of the 29th annual international conference on Computer software and applications conference
Simultaneous learning of perception and action in mobile robots
Robotics and Autonomous Systems
Automatic induction of Bellman-error features for probabilistic planning
Journal of Artificial Intelligence Research
Pagerank optimization in polynomial time by stochastic shortest path reformulation
ALT'10 Proceedings of the 21st international conference on Algorithmic learning theory
Combining reinforcement learning with symbolic planning
ALAMAS'05/ALAMAS'06/ALAMAS'07 Proceedings of the 5th, 6th and 7th European conference on Adaptive and learning agents and multi-agent systems: adaptation and multi-agent learning
Adaptive traffic signal control using vehicle-to-infrastructure communication: a technical note
Proceedings of the Second International Workshop on Computational Transportation Science
Ranking policies in discrete Markov decision processes
Annals of Mathematics and Artificial Intelligence
Stochastic approximation algorithms for constrained optimization via simulation
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Stochastic control via direct comparison
Discrete Event Dynamic Systems
An Improved Dynamic Programming Decomposition Approach for Network Revenue Management
Manufacturing & Service Operations Management
Learning visual representations for perception-action systems
International Journal of Robotics Research
Journal of Artificial Intelligence Research
A CMOS current-mode dynamic programming circuit
IEEE Transactions on Circuits and Systems Part I: Regular Papers - Special section on 2009 IEEE system-on-chip conference
Hessian matrix distribution for Bayesian policy gradient reinforcement learning
Information Sciences: an International Journal
Sampled fictitious play for approximate dynamic programming
Computers and Operations Research
Adaptive modulation with smoothed flow utility
EURASIP Journal on Wireless Communications and Networking
Decentralized MDPs with sparse interactions
Artificial Intelligence
Reinforcement learning for model building and variance-penalized control
Winter Simulation Conference
Winter Simulation Conference
Ambulance redeployment: an approximate dynamic programming approach
Winter Simulation Conference
Declarative programming for agent applications
Autonomous Agents and Multi-Agent Systems
Optimization of heuristic search using recursive algorithm selection and reinforcement learning
Annals of Mathematics and Artificial Intelligence
A dynamic programming strategy to balance exploration and exploitation in the bandit problem
Annals of Mathematics and Artificial Intelligence
Decentralized activation in sensor networks - global games and adaptive filtering games
Digital Signal Processing
Learning to manage combined energy supply systems
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
The Journal of Machine Learning Research
Reinforcement learning and the Bayesian control rule
AGI'11 Proceedings of the 4th international conference on Artificial general intelligence
The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
Network Cargo Capacity Management
Operations Research
The Effect of Robust Decisions on the Cost of Uncertainty in Military Airlift Operations
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Computing Bid Prices for Revenue Management Under Customer Choice Behavior
Manufacturing & Service Operations Management
A framework and a mean-field algorithm for the local control of spatial processes
International Journal of Approximate Reasoning
Self-teaching adaptive dynamic programming for Gomoku
Neurocomputing
Learning finite-state controllers for partially observable environments
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Planning under continuous time and resource uncertainty: a challenge for AI
UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
The thing that we tried didn't work very well: deictic representation in reinforcement learning
UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
Value function approximation in zero-sum Markov games
UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
Policy iteration for factored MDPs
UAI'00 Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence
Robust combination of local controllers
UAI'01 Proceedings of the Seventeenth conference on Uncertainty in artificial intelligence
Structured reachability analysis for Markov decision processes
UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence
Hierarchical solution of Markov decision processes using macro-actions
UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence
Hierarchical Knowledge Gradient for Sequential Sampling
The Journal of Machine Learning Research
Robust Approximate Bilinear Programming for Value Function Approximation
The Journal of Machine Learning Research
Consistency of Sequential Bayesian Sampling Policies
SIAM Journal on Control and Optimization
Theory and Applications of Robust Optimization
SIAM Review
Quantum reinforcement learning
ICNC'05 Proceedings of the First international conference on Advances in Natural Computation - Volume Part II
CEEMAS'05 Proceedings of the 4th international Central and Eastern European conference on Multi-Agent Systems and Applications
General discounting versus average reward
ALT'06 Proceedings of the 17th international conference on Algorithmic Learning Theory
Probabilistic generalization of simple grammars and its application to reinforcement learning
ALT'06 Proceedings of the 17th international conference on Algorithmic Learning Theory
Approximate policy iteration for closed-loop learning of visual tasks
ECML'06 Proceedings of the 17th European conference on Machine Learning
Task-driven discretization of the joint space of visual percepts and continuous actions
ECML'06 Proceedings of the 17th European conference on Machine Learning
Solving uncertain Markov decision problems: an interval-based method
ICNC'06 Proceedings of the Second international conference on Advances in Natural Computation - Volume Part II
Load protection model based on intelligent agent regulation
KES'06 Proceedings of the 10th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part I
Symbolic generalization for on-line planning
UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence
Adaptive utility-based scheduling in resource-constrained systems
AI'05 Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence
COLT'06 Proceedings of the 19th annual conference on Learning Theory
A machine learning approach to intraday trading on foreign exchange markets
IDEAL'05 Proceedings of the 6th international conference on Intelligent Data Engineering and Automated Learning
Scheduling of re-entrant lines with neuro-dynamic programming based on a new evaluating criterion
ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part III
Aspects of optimal viewpoint selection and viewpoint fusion
ACCV'06 Proceedings of the 7th Asian conference on Computer Vision - Volume Part II
Grey reinforcement learning for incomplete information processing
TAMC'06 Proceedings of the Third international conference on Theory and Applications of Models of Computation
Optimal tuning of continual online exploration in reinforcement learning
ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
Feature extraction for decision-theoretic planning in partially observable environments
ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
Reinforcement learning with echo state networks
ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
Towards finite-sample convergence of direct reinforcement learning
ECML'05 Proceedings of the 16th European conference on Machine Learning
ECML'05 Proceedings of the 16th European conference on Machine Learning
Feature-discovering approximate value iteration methods
SARA'05 Proceedings of the 6th international conference on Abstraction, Reformulation and Approximation
CBR for state value function approximation in reinforcement learning
ICCBR'05 Proceedings of the 6th international conference on Case-Based Reasoning Research and Development
ICN'05 Proceedings of the 4th international conference on Networking - Volume Part II
The effect of alteration in service environments with distributed intelligent agents
KES'05 Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part III
Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming
Mathematics of Operations Research
Multi-agent case-based reasoning for cooperative reinforcement learners
ECCBR'06 Proceedings of the 8th European conference on Advances in Case-Based Reasoning
AI'06 Proceedings of the 19th international conference on Advances in Artificial Intelligence: Canadian Society for Computational Studies of Intelligence
Adaptive opportunistic routing for wireless ad hoc networks
IEEE/ACM Transactions on Networking (TON)
Admission control policies for a multi-class QoS-aware service oriented architecture
ACM SIGMETRICS Performance Evaluation Review
Basis function discovery using spectral clustering and bisimulation metrics
ALA'11 Proceedings of the 11th international conference on Adaptive and Learning Agents
Topological value iteration algorithms
Journal of Artificial Intelligence Research
Stochastic enforced hill-climbing
Journal of Artificial Intelligence Research
Brief paper: Average cost temporal-difference learning
Automatica (Journal of IFAC)
Monte Carlo TD(λ)-methods for the optimal control of discrete-time Markovian jump linear systems
Automatica (Journal of IFAC)
Automatica (Journal of IFAC)
Book reviews: Self-learning control of finite Markov chains
Automatica (Journal of IFAC)
A time aggregation approach to Markov decision processes
Automatica (Journal of IFAC)
Book review: Stochastic controls-Hamiltonian systems and HJB equations
Automatica (Journal of IFAC)
Distributionally Robust Markov Decision Processes
Mathematics of Operations Research
DetH: approximate hierarchical solution of large Markov decision processes
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Automatic construction of efficient multiple battery usage policies
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Discovering hidden structure in factored MDPs
Artificial Intelligence
Approximate Dynamic Programming via a Smoothed Linear Program
Operations Research
Feature reinforcement learning in practice
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
ℓ1-Penalized projected Bellman residual
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Recursive least-squares learning with eligibility traces
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
MapReduce for parallel reinforcement learning
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Integrating a partial model into model free reinforcement learning
The Journal of Machine Learning Research
Multi-rate control policies for elastic traffic in CDMA networks
Performance Evaluation
A rapid sparsification method for kernel machines in approximate policy iteration
ISNN'12 Proceedings of the 9th international conference on Advances in Neural Networks - Volume Part I
ISNN'12 Proceedings of the 9th international conference on Advances in Neural Networks - Volume Part I
Proximity-based non-uniform abstractions for approximate planning
Journal of Artificial Intelligence Research
Plan-based policies for efficient multiple battery load management
Journal of Artificial Intelligence Research
Information Sciences: an International Journal
SMART: A Stochastic Multiscale Model for the Analysis of Energy Resources, Technology, and Policy
INFORMS Journal on Computing
Bayesian nonparametric inverse reinforcement learning
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Learning policies for battery usage optimization in electric vehicles
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Bayesian Learning of Noisy Markov Decision Processes
ACM Transactions on Modeling and Computer Simulation (TOMACS) - Special Issue on Monte Carlo Methods in Statistics
Upper confidence tree-based consistent reactive planning application to minesweeper
LION'12 Proceedings of the 6th international conference on Learning and Intelligent Optimization
Improving the exploration in upper confidence trees
LION'12 Proceedings of the 6th international conference on Learning and Intelligent Optimization
Adaptive value function approximation for continuous-state stochastic dynamic programming
Computers and Operations Research
Scheduling fighter aircraft maintenance with reinforcement learning
Proceedings of the Winter Simulation Conference
Stochastic policy search for variance-penalized semi-Markov control
Proceedings of the Winter Simulation Conference
Using approximate dynamic programming to optimize admission control in cloud computing environment
Proceedings of the Winter Simulation Conference
A sampled fictitious play based learning algorithm for infinite horizon Markov decision processes
Proceedings of the Winter Simulation Conference
Identifying effective policies in approximate dynamic programming: beyond regression
Proceedings of the Winter Simulation Conference
American option pricing with randomized quasi-Monte Carlo simulations
Proceedings of the Winter Simulation Conference
Reducing the learning time of Tetris in evolution strategies
EA'11 Proceedings of the 10th international conference on Artificial Evolution
Two-step gradient-based reinforcement learning for underwater robotics behavior learning
Robotics and Autonomous Systems
An Actor-Critic based controller for glucose regulation in type 1 diabetes
Computer Methods and Programs in Biomedicine
Sourcing strategies in supply risk management: An approximate dynamic programming approach
Computers and Operations Research
Dynamic Capacity Allocation to Customers Who Remember Past Service
Management Science
An efficient L2-norm regularized least-squares temporal difference learning algorithm
Knowledge-Based Systems
Assessing the Value of Dynamic Pricing in Network Revenue Management
INFORMS Journal on Computing
Design with shape grammars and reinforcement learning
Advanced Engineering Informatics
Using informative behavior to increase engagement in the TAMER framework
Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems
Mathematics of Operations Research
Performance bounds for λ policy iteration and application to the game of Tetris
The Journal of Machine Learning Research
Finite-sample analysis of least-squares policy iteration
The Journal of Machine Learning Research
The Journal of Machine Learning Research
Engineering Applications of Artificial Intelligence
Planning for multiple measurement channels in a continuous-state POMDP
Annals of Mathematics and Artificial Intelligence
Scenario Trees and Policy Selection for Multistage Stochastic Programming Using Machine Learning
INFORMS Journal on Computing
Probabilistic planning for continuous dynamic systems under bounded risk
Journal of Artificial Intelligence Research
A novel reinforcement learning architecture for continuous state and action spaces
Advances in Artificial Intelligence
Integrated task and motion planning in belief space
International Journal of Robotics Research
Low-discrepancy sampling for approximate dynamic programming with local approximators
Computers and Operations Research
Reinforcement learning algorithms with function approximation: Recent advances and applications
Information Sciences: an International Journal
General time consistent discounting
Theoretical Computer Science
Construction of approximation spaces for reinforcement learning
The Journal of Machine Learning Research
Multiagent meta-level control for radar coordination
Web Intelligence and Agent Systems
A tour of machine learning: An AI perspective
AI Communications - ECAI 2012 Turing and Anniversary Track
From the Publisher: This is the first textbook that fully explains the neuro-dynamic programming/reinforcement learning methodology, which is a recent breakthrough in the practical application of neural networks and dynamic programming to complex problems of planning, optimal decision making, and intelligent control.