A real-world environment is often only partially observable by an agent, whether because of noisy sensors or incomplete perception. Autonomous strategy planning under such uncertainty poses two major challenges: first, autonomous segmentation of the state space for a given task; second, the emergence of complex behaviors that handle each state segment. This paper proposes a new approach that addresses both by combining several techniques into ARKAQ-Learning (ART 2-A networks augmented with Kalman Filters and Q-Learning). The algorithm is online and has low space and computational complexity. It was run on several well-known partially observable Markov decision process (POMDP) problems. The World Model Generator could reveal the hidden states, mapping the non-Markovian model onto a Markovian internal state space, and the Policy Generator could then build the optimal policy on that internal Markovian state model.
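The two-stage pipeline described above can be sketched in miniature: an ART 2-A-style online clusterer maps raw observations to discrete internal states (the World Model Generator role), and tabular Q-learning then runs on those internal states (the Policy Generator role). This is a hedged illustration of the general idea only, not the paper's actual algorithm; the class names, the cosine-similarity matching rule, the vigilance value, and the toy task are all assumptions introduced here for demonstration.

```python
import random

class OnlineClusterer:
    """Illustrative ART-2A-like clusterer (not the paper's implementation):
    assign each observation to the nearest prototype, creating a new
    prototype when no match passes the vigilance test."""
    def __init__(self, vigilance=0.95, lr=0.1):
        self.vigilance = vigilance  # minimum similarity to accept a match
        self.lr = lr                # prototype learning rate
        self.prototypes = []

    @staticmethod
    def _similarity(a, b):
        # Cosine similarity as a simple stand-in for ART 2-A matching.
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb + 1e-12)

    def state_of(self, obs):
        """Return the internal-state index for an observation vector."""
        best, best_sim = None, -1.0
        for i, p in enumerate(self.prototypes):
            s = self._similarity(obs, p)
            if s > best_sim:
                best, best_sim = i, s
        if best is not None and best_sim >= self.vigilance:
            # Move the winning prototype toward the observation.
            p = self.prototypes[best]
            self.prototypes[best] = [(1 - self.lr) * pi + self.lr * oi
                                     for pi, oi in zip(p, obs)]
            return best
        self.prototypes.append(list(obs))
        return len(self.prototypes) - 1

def q_update(Q, s, a, r, s_next, n_actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step over internal (clustered) states."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in range(n_actions))
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)

# Toy task: two noisy observation clusters, each with one correct action.
clusterer = OnlineClusterer()
Q = {}
random.seed(0)
for _ in range(200):
    hidden = random.choice([0, 1])
    obs = ([1.0 + 0.05 * random.random(), 0.0] if hidden == 0
           else [0.0, 1.0 + 0.05 * random.random()])
    s = clusterer.state_of(obs)      # World Model Generator: obs -> state
    a = random.randrange(2)          # random exploration
    r = 1.0 if a == hidden else 0.0  # reward for the matching action
    q_update(Q, s, a, r, s, n_actions=2)  # Policy Generator: learn Q
```

After training, the clusterer has discovered one internal state per hidden cluster, and the greedy action differs between the two internal states, showing that a policy can be learned on the recovered state space even though the raw observations are noisy.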