A central problem in learning in complex environments is balancing exploration of untested actions against exploitation of actions that are known to be good. The benefit of exploration can be estimated using the classical notion of Value of Information: the expected improvement in future decision quality that might arise from the information acquired by exploration. Estimating this quantity requires an assessment of the agent's uncertainty about its current value estimates for states. In this paper, we adopt a Bayesian approach to maintaining this uncertain information. We extend Watkins' Q-learning by maintaining and propagating probability distributions over the Q-values. These distributions are used to compute a myopic approximation to the value of information for each action and hence to select the action that best balances exploration and exploitation. We establish the convergence properties of our algorithm and show experimentally that it can exhibit substantial improvements over other well-known model-free exploration strategies.
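The action-selection idea described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes, purely for illustration, that the belief over each Q-value in the current state is an independent Gaussian summarized by a mean and a standard deviation, and that the agent picks the action with the largest sum of expected Q-value and myopic value of information. The function names (`expected_excess`, `myopic_vpi`, `select_action`) are hypothetical, and at least two actions are assumed.

```python
import math


def _pdf(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    # Standard normal cumulative distribution function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_excess(mean, std, threshold):
    """E[max(X - threshold, 0)] for X ~ Normal(mean, std**2)."""
    if std <= 0.0:
        # Degenerate belief: the value is known exactly.
        return max(mean - threshold, 0.0)
    z = (mean - threshold) / std
    return (mean - threshold) * _cdf(z) + std * _pdf(z)


def myopic_vpi(means, stds):
    """Myopic value of learning each action's true Q-value.

    Assumption (illustrative): Gaussian beliefs, one per action.
    Learning the true value only helps if it changes the greedy choice.
    """
    order = sorted(range(len(means)), key=lambda a: means[a], reverse=True)
    best, second = order[0], order[1]
    vpi = []
    for a in range(len(means)):
        if a == best:
            # Gain if the best action turns out worse than the runner-up:
            # E[max(mean_second - q, 0)], computed on the negated variable.
            vpi.append(expected_excess(-means[a], stds[a], -means[second]))
        else:
            # Gain if this action turns out better than the current best.
            vpi.append(expected_excess(means[a], stds[a], means[best]))
    return vpi


def select_action(means, stds):
    """Pick the action maximizing expected Q-value plus myopic VPI."""
    vpi = myopic_vpi(means, stds)
    scores = [m + v for m, v in zip(means, vpi)]
    return max(range(len(means)), key=lambda a: scores[a])


if __name__ == "__main__":
    # Action 1 has a slightly lower mean but a much wider belief,
    # so exploring it is worth more than exploiting action 0.
    print(select_action([1.0, 0.9, 0.0], [0.01, 1.0, 0.01]))
```

With near-certain beliefs the value-of-information terms vanish and the rule reduces to greedy selection; a wide belief on a near-best action raises its score and encourages exploring it.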