Agent's actions as a classification criteria for the state space in a learning from rewards system

Authors:
Francisco Martinez-Gil
Affiliations:
Departament d'Informatica, Universitat de Valencia, Avda, Valencia, Spain
Venue:
Journal of Experimental & Theoretical Artificial Intelligence
Year:
2008



Abstract

This paper addresses the problem of learning an autonomous agent's policy when the state space is very large and the set of available actions is comparatively small. To this end, we use a non-parametric decision rule (specifically, a nearest-neighbour strategy) to cluster the state space according to the action that leads to a successful situation. Using an exploration strategy to avoid greedy behaviour, the agent builds clusters of positively classified states through trial-and-error learning. We implement a 3D synthetic agent that plays an 'avoid the asteroid' game that suits our assumptions. Using a feature vector space extracted from a visual navigation system as the state space, we test two exploration strategies with the trial-and-error learning method. The experiment shows that the agent is a good classifier over the state space and therefore behaves well in its synthetic world.
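
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class name `NearestNeighbourPolicy`, the epsilon-greedy exploration rule, the Euclidean distance, and the one-dimensional toy success test `succeeded` are all assumptions introduced here for illustration. The abstract does not say which two exploration strategies were tested or how success is defined. The sketch stores only the (state, action) pairs that led to success and acts by looking up the nearest stored state.

```python
import math
import random


class NearestNeighbourPolicy:
    """Hypothetical 1-NN policy: remembers (state, action) pairs that succeeded."""

    def __init__(self, actions, epsilon=0.1, rng=None):
        self.actions = list(actions)
        self.epsilon = epsilon            # assumed epsilon-greedy exploration rate
        self.rng = rng or random.Random()
        self.memory = []                  # positively classified (state, action) pairs

    def nearest_action(self, state):
        """Return the action stored with the closest remembered state, or None."""
        if not self.memory:
            return None
        _, action = min(self.memory, key=lambda pair: math.dist(pair[0], state))
        return action

    def choose(self, state):
        """Explore with probability epsilon, otherwise follow the nearest neighbour."""
        action = self.nearest_action(state)
        if action is None or self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return action

    def record_success(self, state, action):
        """Add a state to the cluster of the action that succeeded in it."""
        self.memory.append((tuple(state), action))


def succeeded(state, action):
    # Toy stand-in for the game outcome (assumption): state[0] is the asteroid's
    # horizontal offset, and dodging away from it counts as success.
    return action == ("left" if state[0] > 0 else "right")


policy = NearestNeighbourPolicy(["left", "right"], epsilon=0.2, rng=random.Random(0))
for _ in range(200):
    s = (policy.rng.uniform(-1.0, 1.0),)
    a = policy.choose(s)
    if succeeded(s, a):                   # trial and error: keep only successes
        policy.record_success(s, a)
```

Because only successful pairs are stored, each action ends up labelling a cluster of states, and the lookup acts as a classifier over the state space. In this sketch a real feature vector from a vision system would replace the one-element tuple `s`.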