On the Use of Option Policies for Autonomous Robot Navigation

Authors:
Carlos H. C. Ribeiro
Affiliations:
-
Venue:
IBERAMIA-SBIA '00 Proceedings of the International Joint Conference, 7th Ibero-American Conference on AI: Advances in Artificial Intelligence
Year:
2000

Abstract

We present results and an analysis of the use of fixed-duration option policies for navigation tasks in autonomous robotics. An option is a sequence of actions taken by the robot without environmental feedback (open-loop control). Using options in place of actions leads to more aggressive exploration of the state space, a convenient feature for tasks where autonomous learning of state trajectories is slow, such as mobile robot navigation. On the other hand, long sequences of actions taken in open loop can be dangerous and, from the point of view of learning, can be counterproductive because of the exponential increase in the size of the policy space. We show here that conservative options (corresponding to short sequences of actions) can be very effective, especially if their improved generalisation capabilities are combined with other mechanisms for increasing the generalisation efficiency of autonomous learning algorithms.
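
The trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the action set, the grid dynamics in `grid_step`, and all function names are hypothetical, chosen only to show what a fixed-duration open-loop option is and how quickly the number of options (and of deterministic state-to-option policies) grows with the option duration.

```python
from itertools import product

# Hypothetical primitive actions; the abstract does not list the robot's actions.
ACTIONS = ("forward", "left", "right")

# Heading index -> unit displacement (0 = east, 1 = north, 2 = west, 3 = south).
MOVES = {0: (1, 0), 1: (0, 1), 2: (-1, 0), 3: (0, -1)}


def fixed_duration_options(actions, duration):
    """Return every action sequence of exactly `duration` steps."""
    return list(product(actions, repeat=duration))


def count_deterministic_policies(num_states, num_actions, duration):
    """Count deterministic state-to-option mappings: (|A|^k)^|S|."""
    return (num_actions ** duration) ** num_states


def grid_step(state, action):
    """Toy deterministic dynamics on an unbounded grid (an assumption)."""
    x, y, heading = state
    if action == "left":
        return (x, y, (heading + 1) % 4)
    if action == "right":
        return (x, y, (heading - 1) % 4)
    dx, dy = MOVES[heading]
    return (x + dx, y + dy, heading)


def run_option(state, option, step):
    """Execute an option in open loop: no observation is used between steps."""
    for action in option:
        state = step(state, action)
    return state


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        n_options = len(fixed_duration_options(ACTIONS, k))
        n_policies = count_deterministic_policies(4, len(ACTIONS), k)
        print(f"duration={k}: options={n_options}, policies over 4 states={n_policies}")
    print(run_option((0, 0, 0), ("forward", "left", "forward"), grid_step))
```

With three primitive actions, options of duration k number 3^k, so even a four-state toy problem has 3^(4k) deterministic policies. This is consistent with the abstract's argument for keeping options short.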