R-MAX is a very simple model-based reinforcement learning algorithm that can attain near-optimal average reward in polynomial time. In R-MAX, the agent always maintains a complete, but possibly inaccurate, model of its environment and acts according to the optimal policy derived from this model. The model is initialized optimistically: all actions in all states return the maximal possible reward (hence the name). During execution, the model is updated based on the agent's observations. R-MAX improves upon several previous algorithms: (1) It is simpler and more general than Kearns and Singh's E3 algorithm, covering zero-sum stochastic games. (2) It has a built-in mechanism for resolving the exploration vs. exploitation dilemma. (3) It formally justifies the "optimism under uncertainty" bias used in many RL algorithms. (4) It is simpler, more general, and more efficient than Brafman and Tennenholtz's LSG algorithm for learning in single-controller stochastic games. (5) It generalizes the algorithm by Monderer and Tennenholtz for learning in repeated games. (6) It is the only algorithm for learning in repeated games, to date, that is provably efficient, considerably improving and simplifying previous algorithms by Banos and by Megiddo.
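The loop described in the abstract (optimistic initialization, acting on the model's optimal policy, updating from observations) can be illustrated with a minimal Python sketch for a small tabular MDP. Several details here are assumptions made for illustration and are not stated in the abstract: the visit threshold `m` after which a state-action pair counts as "known", the discount factor `gamma` (the abstract speaks of average reward; discounted value iteration is swapped in only to keep the example short), and all class and method names. The sketch covers the single-agent case only, not stochastic games.

```python
from collections import defaultdict


class RMaxAgent:
    """Illustrative R-MAX-style agent for a tabular MDP (hypothetical names)."""

    def __init__(self, n_states, n_actions, r_max, m=5, gamma=0.95):
        self.nS, self.nA = n_states, n_actions
        # m and gamma are assumptions of this sketch, not taken from the abstract.
        self.r_max, self.m, self.gamma = r_max, m, gamma
        self.counts = [[0] * n_actions for _ in range(n_states)]
        self.reward_sum = [[0.0] * n_actions for _ in range(n_states)]
        self.next_counts = [[defaultdict(int) for _ in range(n_actions)]
                            for _ in range(n_states)]
        # Optimistic start: every action looks as if it paid r_max forever.
        self.q = [[r_max / (1 - gamma)] * n_actions for _ in range(n_states)]

    def act(self, s):
        # Greedy with respect to the current (optimistic) model; ties go to
        # the lowest-numbered action.
        row = self.q[s]
        return max(range(self.nA), key=row.__getitem__)

    def observe(self, s, a, r, s_next):
        if self.counts[s][a] >= self.m:
            return  # pair is already "known"; its estimates stay fixed
        self.counts[s][a] += 1
        self.reward_sum[s][a] += r
        self.next_counts[s][a][s_next] += 1
        if self.counts[s][a] == self.m:
            self.plan()  # the model changed, so recompute the policy

    def plan(self, sweeps=200):
        optimistic = self.r_max / (1 - self.gamma)
        for _ in range(sweeps):
            v = [max(row) for row in self.q]  # state values from the last sweep
            for s in range(self.nS):
                for a in range(self.nA):
                    n = self.counts[s][a]
                    if n < self.m:
                        # Unknown pairs keep the maximal possible value.
                        self.q[s][a] = optimistic
                    else:
                        r_hat = self.reward_sum[s][a] / n
                        exp_v = sum(c * v[s2] for s2, c
                                    in self.next_counts[s][a].items()) / n
                        self.q[s][a] = r_hat + self.gamma * exp_v
```

In this sketch, exploration needs no separate mechanism: an unknown state-action pair keeps its maximal value, so the greedy policy is drawn toward it until it has been tried `m` times, after which the empirical estimates take over.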