Constructing deterministic finite-state automata in recurrent neural networks
Journal of the ACM (JACM)
Recent work has shown that recurrent neural networks have the ability to learn finite state automata from examples. In particular, networks using second-order units have been successful at this task. In studying the performance and learning behavior of such networks, we have found that the second-order network model attempts to form clusters in activation space as its internal representation of states. However, these learned states become unstable as longer and longer test input strings are presented to the network. In essence, the network forgets where the individual states are in activation space. In this paper we propose a new method to force such a network to learn stable states by introducing discretization into the network and using a pseudo-gradient learning rule to perform training. The essence of the learning rule is that, in doing gradient descent, it uses the gradient of a sigmoid function as a heuristic hint in place of that of the hard-limiting function, while still using the discretized value in the feedback update path. The new structure uses isolated points in activation space instead of vague clusters as its internal representation of states. It is shown to have capabilities similar to those of the original network in learning finite state automata, but without the instability problem. The proposed pseudo-gradient learning rule may also be used as a basis for training other types of networks that have hard-limiting threshold activation functions.
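The core of the pseudo-gradient idea can be illustrated on a single hard-limiting unit: the forward pass uses the discretized (step-function) output, while the weight update substitutes the sigmoid's derivative for the step function's derivative, which is zero almost everywhere. The following is a minimal sketch under assumed details (the toy AND task, learning rate, and initialization are illustrative, not from the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hard_limit(z):
    # Discretized activation actually used in the forward path.
    return (z > 0).astype(float)

# Illustrative toy task: learn logical AND with one hard-limiting unit.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

rng = np.random.default_rng(0)
w = rng.normal(size=2)
b = 0.0
lr = 0.5

for epoch in range(200):
    for x, t in zip(X, y):
        net = w @ x + b
        out = hard_limit(net)      # hard-limited value used for the output
        err = out - t
        # Pseudo-gradient: use the sigmoid's derivative as a heuristic
        # surrogate for the hard limiter's (zero) derivative.
        g = err * sigmoid(net) * (1.0 - sigmoid(net))
        w -= lr * g * x
        b -= lr * g

preds = hard_limit(X @ w + b)
```

After training, `preds` matches the AND targets. The same substitution generalizes to recurrent networks with discretized state units, where the discretized state is fed back while gradients flow through the sigmoid surrogate.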