In this paper, we consider reinforcement learning in systems with an unknown environment, where the agent must trade off efficiently between exploration (long-term optimization) and exploitation (short-term optimization). The ε-greedy algorithm is a near-greedy action selection rule: it behaves greedily (exploitation) most of the time, but with a small probability ε it instead selects an action at random (exploration). Prior work has shown that such random exploration drives the agent towards poorly modeled states. This study therefore evaluates the role of heuristic-based exploration in reinforcement learning. We propose three methods: neighborhood-search-based exploration, simulated-annealing-based exploration, and tabu-search-based exploration. All three follow the same rule: "Explore the most unvisited state." In simulation, these techniques are evaluated and compared on a discrete reinforcement learning task (robot navigation).
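The contrast the abstract draws between ε-greedy's uniform random exploration and the "explore the most unvisited state" heuristic can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function names, the tabular Q-value representation, and the per-action visit counts are assumptions made for the example.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """epsilon-greedy selection: exploit the greedy action with
    probability 1 - epsilon, otherwise pick uniformly at random."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))  # explore: uninformed random choice
    # exploit: action with the highest estimated value
    return max(range(len(q_values)), key=q_values.__getitem__)

def least_visited(q_values, visit_counts):
    """Count-based heuristic in the spirit of 'explore the most
    unvisited state': prefer the action tried least often, steering
    exploration towards poorly modeled states instead of random ones."""
    return min(range(len(q_values)), key=visit_counts.__getitem__)
```

With ε = 0 the first rule reduces to pure greedy exploitation, while the second rule ignores the value estimates entirely and is the kind of directed exploration the proposed neighborhood-search, simulated-annealing, and tabu-search variants build on.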