Planning and acting in partially observable stochastic domains

Authors:
Leslie Pack Kaelbling;Michael L. Littman;Anthony R. Cassandra
Affiliations:
Computer Science Department, Brown University, Box 1910, Providence, RI 02912-1910, USA and Department o ...;Department of Computer Science, Duke University, Durham, NC 27708-0129, USA;Microelectronics and Computer Technology Corporation (MCC), 3500 West Balcones Center Drive, Austin, TX 78759-5398, USA
Venue:
Artificial Intelligence
Year:
1998

Citing 34
Cited 193



Abstract

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (MDPs) and partially observable MDPs (POMDPs). We then outline a novel algorithm for solving POMDPs offline and show how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP. We conclude with a discussion of how our approach relates to previous work, the complexity of finding exact solutions to POMDPs, and some possibilities for finding approximate solutions.
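
As a rough illustration of the POMDP theory the abstract refers to, the sketch below shows the standard Bayesian belief-state update for a discrete POMDP: after taking an action and receiving an observation, the agent revises its probability distribution over hidden states. This is not taken from the paper. The function name `belief_update`, the dictionary layout of `T` and `O`, and the two-state toy model are all assumptions made for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Return the posterior belief after one action/observation step.

    belief: dict mapping state -> probability (assumed to sum to 1)
    T:      T[s][a][s2] = P(s2 | s, a)   (hypothetical layout)
    O:      O[a][s2][o] = P(o | a, s2)   (hypothetical layout)
    """
    unnormalized = {}
    for s2 in belief:
        # Prediction step: probability of landing in s2 under the action.
        predicted = sum(belief[s] * T[s][action].get(s2, 0.0) for s in belief)
        # Correction step: weight by the likelihood of the observation.
        unnormalized[s2] = O[action][s2].get(observation, 0.0) * predicted

    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical toy model: the hidden state never changes, and "listen"
# reports the true side with probability 0.85.
T = {
    "left": {"listen": {"left": 1.0}},
    "right": {"listen": {"right": 1.0}},
}
O = {
    "listen": {
        "left": {"hear_left": 0.85, "hear_right": 0.15},
        "right": {"hear_left": 0.15, "hear_right": 0.85},
    }
}

b0 = {"left": 0.5, "right": 0.5}
b1 = belief_update(b0, "listen", "hear_left", T, O)
print(b1)
```

Starting from a uniform belief, a single "hear_left" observation shifts most of the probability mass to the "left" state. A policy defined over such beliefs, rather than over raw observations, is what lets an agent act well when the true state is hidden.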