Near-Optimal Reinforcement Learning in Polynomial Time

Authors:
Michael J. Kearns;Satinder P. Singh
Venue:
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Year:
1998

Citing 0
Cited 55

Efficient exploration for optimizing immediate reward

AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference
Control of exploitation-exploration meta-parameter in reinforcement learning

Neural Networks - Computational models of neuromodulation
Characterizing Markov Decision Processes

ECML '02 Proceedings of the 13th European Conference on Machine Learning
Faster Near-Optimal Reinforcement Learning: Adding Adaptiveness to the E3 Algorithm

ALT '99 Proceedings of the 10th International Conference on Algorithmic Learning Theory
Self-Optimizing and Pareto-Optimal Policies in General Environments Based on Bayes-Mixtures

COLT '02 Proceedings of the 15th Annual Conference on Computational Learning Theory
PAC Bounds for Multi-armed Bandit and Markov Decision Processes

COLT '02 Proceedings of the 15th Annual Conference on Computational Learning Theory
Sequential Sampling Techniques for Algorithmic Learning Theory

ALT '00 Proceedings of the 11th International Conference on Algorithmic Learning Theory
Polynomial-time reinforcement learning of near-optimal policies

Eighteenth national conference on Artificial intelligence
R-max - a general polynomial time algorithm for near-optimal reinforcement learning

The Journal of Machine Learning Research
A Geometric Approach to Multi-Criterion Reinforcement Learning

The Journal of Machine Learning Research
Using relative novelty to identify useful temporal abstractions in reinforcement learning

ICML '04 Proceedings of the twenty-first international conference on Machine learning
Efficient learning equilibrium

Artificial Intelligence
Reliability of internal prediction/estimation and its application: I. adaptive action selection reflecting reliability of value function

Neural Networks
Efficient learning of multi-step best response

Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
Hedged learning: regret-minimization with learning experts

ICML '05 Proceedings of the 22nd international conference on Machine learning
Relating reinforcement learning performance to classification performance

ICML '05 Proceedings of the 22nd international conference on Machine learning
Bayesian sparse sampling for on-line reward optimization

ICML '05 Proceedings of the 22nd international conference on Machine learning
An intrinsic reward mechanism for efficient exploration

ICML '06 Proceedings of the 23rd international conference on Machine learning
Sequential sampling techniques for algorithmic learning theory

Theoretical Computer Science - Algorithmic learning theory (ALT 2000)
If multi-agent learning is the answer, what is the question?

Artificial Intelligence
No regrets about no-regret

Artificial Intelligence
Percentile optimization in uncertain Markov decision processes with application to efficient exploration

Proceedings of the 24th international conference on Machine learning
Guiding exploration by pre-existing knowledge without modifying reward

Neural Networks
Model-based function approximation in reinforcement learning

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Hierarchical model-based reinforcement learning: R-max + MAXQ

Proceedings of the 25th international conference on Machine learning
The many faces of optimism: a unifying approach

Proceedings of the 25th international conference on Machine learning
The utility of temporal abstraction in reinforcement learning

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
The permutable POMDP: fast solutions to POMDPs for preference elicitation

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Optimistic initialization and greediness lead to polynomial time learning in factored MDPs

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Compositional Models for Reinforcement Learning

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
A reinforcement learning algorithm with polynomial interaction complexity for only-costly-observable MDPs

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Learning to Coordinate Efficiently: a model-based approach

Journal of Artificial Intelligence Research
Accelerating reinforcement learning through implicit imitation

Journal of Artificial Intelligence Research
Integrating learning from examples into the search for diagnostic policies

Journal of Artificial Intelligence Research
A near-optimal poly-time algorithm for learning in a class of stochastic games

IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
Efficient reinforcement learning in factored MDPs

IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
Using linear programming for Bayesian exploration in Markov decision processes

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
R-MAX: a general polynomial time algorithm for near-optimal reinforcement learning

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Reinforcement learning in POMDPs without resets

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Transfer Learning for Reinforcement Learning Domains: A Survey

The Journal of Machine Learning Research
Model-based exploration in continuous state spaces

SARA'07 Proceedings of the 7th International conference on Abstraction, reformulation, and approximation
Efficient exploration through active learning for value function approximation in reinforcement learning

Neural Networks
Reward-modulated hebbian learning of decision making

Neural Computation
Universal reinforcement learning

IEEE Transactions on Information Theory
Uncertainty Propagation for Efficient Exploration in Reinforcement Learning

Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
Learning the behavior model of a robot

Autonomous Robots
A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes

The Journal of Machine Learning Research
SD-Q: selective discount Q learning based on new results of intertemporal choice theory

AICI'11 Proceedings of the Third international conference on Artificial intelligence and computational intelligence - Volume Part II
Model based Bayesian exploration

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Towards finite-sample convergence of direct reinforcement learning

ECML'05 Proceedings of the 16th European conference on Machine Learning
Abstraction and generalization in reinforcement learning: a summary and framework

ALA'09 Proceedings of the Second international conference on Adaptive and Learning Agents
Robust bayesian reinforcement learning through tight lower bounds

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Optimistic agents are asymptotically optimal

AI'12 Proceedings of the 25th Australasian joint conference on Advances in Artificial Intelligence
Cooperating with a markovian ad hoc teammate

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Multiagent learning in the presence of memory-bounded agents

Autonomous Agents and Multi-Agent Systems

Quantified Score

Hi-index	0.06
