Reinforcement learning systems must often balance exploration of untested actions against exploitation of actions that are known to be good. The benefit of exploration can be estimated using the classical notion of Value of Information: the expected improvement in future decision quality that arises from the information acquired by exploring. Estimating this quantity requires assessing the agent's uncertainty about its current value estimates for states. In this paper we investigate ways to represent and reason about this uncertainty in model-based algorithms, where the system attempts to learn a model of its environment. We explicitly represent uncertainty about the parameters of the model and induce probability distributions over Q-values from these. The distributions are used to compute a myopic approximation to the value of information for each action, and hence to select the action that best balances exploration and exploitation.
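The scheme described above can be sketched in code. The following is a minimal illustration, not the paper's implementation: it assumes a small discrete MDP with known rewards, represents model uncertainty as a Dirichlet posterior over transition probabilities (the count array `alpha` is a stand-in for whatever posterior the agent maintains), approximates the Q-value distribution by solving sampled models, and selects the action maximizing mean Q-value plus a myopic value-of-perfect-information term. The function names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_q_values(alpha, rewards, gamma=0.9, n_models=50, n_iter=100):
    """Draw MDP models from a Dirichlet posterior over transitions and
    solve each by value iteration, yielding an empirical distribution
    over Q-values for every (state, action) pair."""
    n_s, n_a, _ = alpha.shape
    qs = np.empty((n_models, n_s, n_a))
    for m in range(n_models):
        # One transition model sampled from the posterior counts.
        T = np.array([[rng.dirichlet(alpha[s, a]) for a in range(n_a)]
                      for s in range(n_s)])          # (n_s, n_a, n_s)
        q = np.zeros((n_s, n_a))
        for _ in range(n_iter):
            v = q.max(axis=1)
            q = rewards + gamma * (T @ v)            # Bellman backup
        qs[m] = q
    return qs

def myopic_vpi(q_samples, s):
    """Myopic value of perfect information for each action in state s:
    the expected gain in decision quality if that action's true Q-value
    were revealed, estimated over the sampled models."""
    q = q_samples[:, s, :]                           # (n_models, n_a)
    means = q.mean(axis=0)
    best, second = np.sort(means)[::-1][:2]
    vpi = np.empty(q.shape[1])
    for a in range(q.shape[1]):
        if means[a] == best:
            # Gain from learning the apparent best action is overrated.
            vpi[a] = np.maximum(second - q[:, a], 0.0).mean()
        else:
            # Gain from learning this action beats the apparent best.
            vpi[a] = np.maximum(q[:, a] - best, 0.0).mean()
    return vpi

def select_action(q_samples, s):
    """Pick the action with the best mean Q-value plus exploration bonus."""
    means = q_samples[:, s, :].mean(axis=0)
    return int(np.argmax(means + myopic_vpi(q_samples, s)))
```

Because the VPI bonus shrinks as the posterior concentrates, exploration fades automatically once the model is well learned, without a hand-tuned exploration rate.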