Architectural bias in recurrent neural networks: fractal analysis

Authors:
Peter Tiňo;Barbara Hammer
Affiliations:
Aston University, Birmingham B4 7ET, U. K.;University of Osnabrück, D-49069 Osnabrück, Germany
Venue:
Neural Computation
Year:
2003

Citing 20
Cited 7

Fractals everywhere

Fractals everywhere
Learning sequential structure in simple recurrent networks

Advances in neural information processing systems 1
Automata on infinite objects

Handbook of theoretical computer science (vol. B)
Learning and extracting finite state automata with second-order recurrent neural networks

Neural Computation
Induction of finite-state languages using second-order recurrent networks

Neural Computation
Escape-time visualization method for language-restricted iterated function systems

Proceedings of the conference on Graphics interface '92
Affine automata and related techniques for generation of complex images

Theoretical Computer Science
Learning and extracting initial mealy automata with a modular neural network model

Neural Computation
Learning the initial state of a second-order recurrent neural network during regular-language inference

Neural Computation
Representation of finite state automata in recurrent radial basis function networks

Machine Learning
Analysis of dynamical recognizers

Neural Computation
Iterated function systems and control languages

Information and Computation
Valuations and Unambiguity of Languages, with Applications to Fractal Geometry

ICALP '94 Proceedings of the 21st International Colloquium on Automata, Languages and Programming
Learning the Dynamics of Embedded Clauses

Applied Intelligence
Attractive Periodic Sets in Discrete-Time Recurrent Networks (with Emphasis on Fixed-Point Stability and Bifurcations in Two-Neuron Networks)

Neural Computation
Spatiotemporal Connectionist Networks: A Taxonomy and Review

Neural Computation
The dynamics of discrete-time computation, with application to recurrent neural networks and finite state machine extraction

Neural Computation
Recurrent networks for structured data - A unifying approach and its properties

Cognitive Systems Research
A general framework for adaptive processing of data structures

IEEE Transactions on Neural Networks
On learning context-free and context-sensitive languages

IEEE Transactions on Neural Networks

The Applicability of Recurrent Neural Networks for Biological Sequence Analysis

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Rule Extraction from Recurrent Neural Networks: A Taxonomy and Review

Neural Computation
Dynamics and Topographic Organization of Recursive Self-Organizing Maps

Neural Computation
Organization of the state space of a simple recurrent network before and after training on recursive linguistic structures

Neural Networks
Analysis and Visualization of the Dynamics of Recurrent Neural Networks for Symbolic Sequences Processing

ICANN '08 Proceedings of the 18th international conference on Artificial Neural Networks, Part II
A database of glyphs for OCR of mathematical documents

MKM'05 Proceedings of the 4th international conference on Mathematical Knowledge Management
Tree Echo State Networks

Neurocomputing

Abstract

We have recently shown that when initialized with "small" weights, recurrent neural networks (RNNs) with standard sigmoid-type activation functions are inherently biased toward Markov models; even prior to any training, RNN dynamics can be readily used to extract finite memory machines (Hammer & Tiňo, 2002; Tiňo, Čerňanský, & Beňušková, 2002a, 2002b). Following Christiansen and Chater (1999), we refer to this phenomenon as the architectural bias of RNNs. In this article, we extend our work on the architectural bias in RNNs by performing a rigorous fractal analysis of recurrent activation patterns. We assume the network is driven by sequences obtained by traversing an underlying finite-state transition diagram, a scenario that has been frequently considered in the past, for example, when studying RNN-based learning and implementation of regular grammars and finite-state transducers. We obtain lower and upper bounds on various types of fractal dimensions, such as box counting and Hausdorff dimensions. It turns out that not only can the recurrent activations inside RNNs with small initial weights be explored to build Markovian predictive models, but also the activations form fractal clusters, the dimension of which can be bounded by the scaled entropy of the underlying driving source. The scaling factors are fixed and are given by the RNN parameters.
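
The Markovian clustering described in the abstract can be illustrated with a small sketch. It is not the authors' construction or their experimental setup. It assumes a two-unit logistic-sigmoid RNN with weights drawn uniformly from [-0.5, 0.5], one-hot symbol inputs, and a random two-symbol driving sequence in place of a finite-state source. All function names (`make_rnn`, `step`, `contraction_bound`, `suffix_spread`) are hypothetical. Because the sigmoid slope is at most 1/4, each symbol-driven state map is a contraction whenever a quarter of the recurrent weight norm is below 1, so states that share their last k input symbols lie within c^k times the diameter of the state space. The sketch does not estimate any fractal dimension.

```python
import math
import random


def sigmoid(u):
    """Logistic sigmoid; its derivative never exceeds 1/4."""
    return 1.0 / (1.0 + math.exp(-u))


def make_rnn(n_units, n_symbols, scale, rng):
    """Draw recurrent (W) and input (V) weights uniformly from [-scale, scale]."""
    W = [[rng.uniform(-scale, scale) for _ in range(n_units)] for _ in range(n_units)]
    V = [[rng.uniform(-scale, scale) for _ in range(n_symbols)] for _ in range(n_units)]
    return W, V


def step(W, V, state, symbol):
    """One state update; a one-hot input simply selects column `symbol` of V."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, state)) + V[i][symbol])
        for i, row in enumerate(W)
    ]


def contraction_bound(W):
    """Upper bound on the Lipschitz constant of every symbol-driven state map.

    The Frobenius norm bounds the spectral norm of W from above, and the
    sigmoid slope is at most 1/4.
    """
    frobenius = math.sqrt(sum(w * w for row in W for w in row))
    return 0.25 * frobenius


def suffix_spread(W, V, sequence, k):
    """Largest Euclidean distance between states whose last k inputs agree."""
    state = [0.5] * len(W)  # assumed initial state, inside the unit cube
    groups = {}
    for t, symbol in enumerate(sequence):
        state = step(W, V, state, symbol)
        if t + 1 >= k:
            key = tuple(sequence[t + 1 - k : t + 1])
            groups.setdefault(key, []).append(state)
    spread = 0.0
    for states in groups.values():
        for a in states:
            for b in states:
                d = math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
                spread = max(spread, d)
    return spread


if __name__ == "__main__":
    rng = random.Random(0)
    W, V = make_rnn(n_units=2, n_symbols=2, scale=0.5, rng=rng)
    c = contraction_bound(W)
    sequence = [rng.randrange(2) for _ in range(300)]
    for k in range(1, 5):
        bound = c ** k * math.sqrt(2)  # sqrt(2) is the diameter of [0, 1]^2
        print(k, suffix_spread(W, V, sequence, k), bound)
```

In this sketch the printed spread for each suffix length k stays below the corresponding bound, which is the sense in which an untrained small-weight network already groups its activations by recent input history.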