Learning finite machines with self-clustering recurrent networks

  • Authors:
  • Zheng Zeng; Rodney M. Goodman; Padhraic Smyth

  • Venue:
  • Neural Computation
  • Year:
  • 1993

Abstract

Recent work has shown that recurrent neural networks have the ability to learn finite state automata from examples. In particular, networks using second-order units have been successful at this task. In studying the performance and learning behavior of such networks we have found that the second-order network model attempts to form clusters in activation space as its internal representation of states. However, these learned states become unstable as longer and longer test input strings are presented to the network. In essence, the network forgets where the individual states are in activation space. In this paper we propose a new method to force such a network to learn stable states by introducing discretization into the network and using a pseudo-gradient learning rule to perform training. The essence of the learning rule is that in doing gradient descent, it makes use of the gradient of a sigmoid function as a heuristic hint in place of that of the hard-limiting function, while still using the discretized value in the feedback update path. The new structure uses isolated points in activation space instead of vague clusters as its internal representation of states. It is shown to have capabilities in learning finite state automata similar to those of the original network, but without the instability problem. The proposed pseudo-gradient learning rule may also be used as a basis for training other types of networks that have hard-limiting threshold activation functions.
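
The pseudo-gradient idea described in the abstract can be illustrated with a short sketch. The NumPy fragment below (all names, shapes, the two-level discretizer, and the update shown are illustrative assumptions, not the authors' code) shows a single second-order state-update step that feeds back a discretized state, while the weight update substitutes the sigmoid's derivative for the zero-almost-everywhere derivative of the hard limiter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discretize(a, levels=2):
    # Hard-limit activations to isolated points in activation space
    # (here simply 0/1); this plays the role of the discretization
    # the abstract introduces into the network.
    return np.round(a * (levels - 1)) / (levels - 1)

rng = np.random.default_rng(0)
n_states, n_symbols = 4, 2
# Second-order weights W[j, i, k] couple state unit i with input symbol k.
W = rng.normal(scale=0.5, size=(n_states, n_states, n_symbols))

def step(state, symbol_onehot, W):
    # net_j = sum_{i,k} W[j, i, k] * state[i] * x[k]  (second-order unit)
    net = np.einsum('jik,i,k->j', W, state, symbol_onehot)
    analog = sigmoid(net)
    # The *discretized* value is what gets fed back on the next step.
    return discretize(analog), analog, net

def pseudo_grad_step(state, symbol_onehot, W, upstream):
    # Pseudo-gradient: act as if d(discretize)/d(net) were sigmoid'(net),
    # i.e. use the sigmoid's gradient as a heuristic hint for the hard limiter.
    _, analog, _ = step(state, symbol_onehot, W)
    d_net = upstream * analog * (1.0 - analog)
    dW = np.einsum('j,i,k->jik', d_net, state, symbol_onehot)
    return dW

# One forward step and a weight update for an arbitrary error signal:
state = np.zeros(n_states); state[0] = 1.0     # start state
x = np.array([1.0, 0.0])                        # input symbol as one-hot
new_state, _, _ = step(state, x, W)
dW = pseudo_grad_step(state, x, W, upstream=np.ones(n_states))
W -= 0.1 * dW                                   # plain gradient-descent update
```

The key design point mirrored here is that the forward and feedback paths always use the discretized state, so the learned states are isolated points rather than drifting clusters, while the sigmoid derivative supplies a usable descent direction during training.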