Efficient Exploration In Reinforcement Learning

Authors:
Sebastian B. Thrun
Affiliations:
-
Venue:
Efficient Exploration In Reinforcement Learning
Year:
1992

Abstract

Exploration plays a fundamental role in any active learning system. This paper evaluates the role of exploration in active learning and describes several local techniques for exploration in finite, discrete domains, embedded in a reinforcement learning framework (delayed reinforcement). It distinguishes between two families of exploration schemes: undirected and directed exploration. While the former family is closely related to random walk exploration, directed exploration techniques memorize exploration-specific knowledge that is used to guide the exploration search. In many finite deterministic domains, any learning technique based on undirected exploration is inefficient in terms of learning time, i.e., learning time is expected to scale exponentially with the size of the state space. We prove that for all these domains, reinforcement learning using a directed technique can always be performed in polynomial time, demonstrating the important role of exploration in reinforcement learning. (The proof is given for one specific directed exploration technique named counter-based exploration.) Subsequently, several exploration techniques found in recent reinforcement learning and connectionist adaptive control literature are described. To trade off efficiently between exploration and exploitation, a trade-off that characterizes many real-world active learning tasks, combination methods are described that explore and avoid costs simultaneously. These include a selective attention mechanism, which allows smooth switching between exploration and exploitation. All techniques are evaluated and compared on a discrete reinforcement learning task (robot navigation). The empirical evaluation is followed by an extensive discussion of the benefits and limitations of this work.
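The contrast between undirected and directed exploration can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the chain-with-reset domain, the function names `step`, `random_walk` and `counter_based`, and the simplified rule "move toward the least-visited known successor, trying untried actions first". The paper's own counter-based rule may be formulated differently.

```python
import random


def step(state, action, n):
    """Hypothetical deterministic chain: action 0 advances, action 1 resets to state 0."""
    return min(state + 1, n - 1) if action == 0 else 0


def random_walk(n, rng, max_steps=10**6):
    """Undirected exploration: choose actions uniformly at random."""
    state, steps = 0, 0
    while state != n - 1 and steps < max_steps:
        state = step(state, rng.choice((0, 1)), n)
        steps += 1
    return steps


def counter_based(n, max_steps=10**6):
    """Directed exploration (simplified): prefer untried actions, then the
    action whose remembered successor state has the lowest visit count."""
    counts = [0] * n   # visit counter per state (exploration-specific memory)
    model = {}         # (state, action) -> successor observed so far
    state, steps = 0, 0
    counts[0] = 1

    def score(s, a):
        nxt = model.get((s, a))
        return -1 if nxt is None else counts[nxt]  # untried actions rank first

    while state != n - 1 and steps < max_steps:
        # Ties are broken in favour of the reset action (listed first),
        # which is the unfavourable choice in this domain.
        action = min((1, 0), key=lambda a: score(state, a))
        nxt = step(state, action, n)
        model[(state, action)] = nxt
        counts[nxt] += 1
        state = nxt
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (5, 10, 15):
        print(n, counter_based(n), random_walk(n, rng))
```

In this toy domain the random walk needs n - 1 consecutive "advance" choices, so its expected time to reach the last state grows on the order of 2^n. The counter-based agent, even with unfavourable tie-breaking, wastes one reset per newly discovered state and then walks back, which gives a step count that is quadratic in n (for example, 209 steps for n = 20).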