Recurrent neural networks with small weights implement definite memory machines

  • Authors:
  • Barbara Hammer; Peter Tiňo

  • Affiliations:
  • Department of Mathematics/Computer Science, University of Osnabrück, D-49069, Osnabrück, Germany; School of Computer Science, University of Birmingham, Edgbaston, Birmingham B15 2TT, U.K.

  • Venue:
  • Neural Computation
  • Year:
  • 2003

Abstract

Recent experimental studies indicate that recurrent neural networks initialized with "small" weights are inherently biased toward definite memory machines (Tiňo, Čerňanský, & Beňušková, 2002a, 2002b). This article establishes a theoretical counterpart: the transition function of a recurrent network with small weights and a squashing activation function is a contraction. We prove that recurrent networks with a contractive transition function can be approximated arbitrarily well on input sequences of unbounded length by a definite memory machine. Conversely, every definite memory machine can be simulated by a recurrent network with a contractive transition function. Hence, initialization with small weights induces an architectural bias into learning with recurrent neural networks. This bias might have benefits from the point of view of statistical learning theory: it emphasizes one possible region of the weight space where generalization ability can be formally proved. It is well known that standard recurrent neural networks are not distribution-independent learnable in the probably approximately correct (PAC) sense if arbitrary precision and inputs are considered. We prove that recurrent networks with a contractive transition function with a fixed contraction parameter fulfill the so-called distribution-independent uniform convergence of empirical distances property and hence, unlike general recurrent networks, are distribution-independent PAC learnable.
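
The following is a minimal numerical sketch (not the paper's construction) of the contraction argument behind the abstract: for the state update x ↦ tanh(Wx + Vu + b), the derivative of tanh is bounded by 1, so the Lipschitz constant in x is at most the spectral norm of W; with "small" recurrent weights this norm is below 1, the map is a contraction, and the state effectively forgets inputs older than a finite horizon, which is exactly the definite-memory behavior. All names and parameter values below (rnn_step, W, V, the horizon k, etc.) are illustrative assumptions.

```python
# Sketch: a small-weight tanh RNN behaves like a definite memory machine.
import numpy as np

rng = np.random.default_rng(0)
n_hidden, n_in = 10, 3

# "Small" recurrent weights: since |tanh'(z)| <= 1, the transition map
# x -> tanh(W x + V u + b) has Lipschitz constant <= ||W||_2 in x.
W = rng.standard_normal((n_hidden, n_hidden))
W *= 0.5 / np.linalg.norm(W, 2)          # rescale so ||W||_2 = 0.5 < 1
V = rng.standard_normal((n_hidden, n_in))
b = rng.standard_normal(n_hidden)

def rnn_step(x, u):
    """One step of the state transition x_{t+1} = tanh(W x_t + V u_t + b)."""
    return np.tanh(W @ x + V @ u + b)

def final_state(inputs, x0):
    x = x0
    for u in inputs:
        x = rnn_step(x, u)
    return x

contraction = np.linalg.norm(W, 2)
print(f"contraction parameter <= ||W||_2 = {contraction:.2f}")

# Two long input sequences that differ everywhere except in their last k symbols.
T, k = 200, 15
suffix = [rng.standard_normal(n_in) for _ in range(k)]
seq_a = [rng.standard_normal(n_in) for _ in range(T - k)] + suffix
seq_b = [rng.standard_normal(n_in) for _ in range(T - k)] + suffix

x_a = final_state(seq_a, np.zeros(n_hidden))
x_b = final_state(seq_b, np.zeros(n_hidden))

# After k shared inputs the state gap shrinks to roughly contraction**k times
# the (bounded) gap left by the differing prefixes, so only the last k inputs matter.
print(f"state difference after shared suffix of length {k}: "
      f"{np.linalg.norm(x_a - x_b):.2e}  (scale ~ {contraction**k:.2e})")
```

With the assumed values (||W||_2 = 0.5, k = 15), the final states of the two sequences agree to within about 0.5^15 times a constant, which is the sense in which such a network is approximated by a definite memory machine of depth k.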