Multilayer feedforward networks are universal approximators. Neural Networks.
What size net gives valid generalization? Neural Computation.
Decision theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation.
Feedforward nets for interpolation and classification. Journal of Computer and System Sciences.
Fat-shattering and the learnability of real-valued functions. COLT '94: Proceedings of the Seventh Annual Conference on Computational Learning Theory.
Analog computation via neural networks. Theoretical Computer Science.
Some new results on neural network approximation. Neural Networks.
Discrete Sequence Prediction and Its Applications. Machine Learning.
On the computational power of neural nets. Journal of Computer and System Sciences.
Polynomial bounds for VC dimension of sigmoidal neural networks. STOC '95: Proceedings of the Twenty-Seventh Annual ACM Symposium on Theory of Computing.
Constructing deterministic finite-state automata in recurrent neural networks. Journal of the ACM (JACM).
The power of amnesia: learning probabilistic automata with variable memory length. Machine Learning, special issue on COLT '94.
Self-organizing maps. Neural Computation.
On the effect of analog noise in discrete-time analog computations. Neural Computation.
A view of the EM algorithm that justifies incremental, sparse, and other variants. Learning in Graphical Models.
A Theory of Learning and Generalization: With Applications to Neural Networks and Control Systems.
Learning in Neural Networks: Theoretical Foundations.
Unified Integration of Explicit Knowledge and Learning by Example in Recurrent Networks. IEEE Transactions on Knowledge and Data Engineering.
Simple Strategies to Encode Tree Automata in Sigmoid Recursive Neural Networks. IEEE Transactions on Knowledge and Data Engineering.
Generalization Ability of Folding Networks. IEEE Transactions on Knowledge and Data Engineering.
Two Methods for Improving Performance of an HMM and Their Application for Gene Finding. Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology.
Architectural Bias in Recurrent Neural Networks - Fractal Analysis. ICANN '02: Proceedings of the International Conference on Artificial Neural Networks.
Generalization of Elman Networks. ICANN '97: Proceedings of the 7th International Conference on Artificial Neural Networks.
Vapnik-Chervonenkis Dimension of Recurrent Neural Networks. EuroCOLT '97: Proceedings of the Third European Conference on Computational Learning Theory.
Design of a linguistic postprocessor using variable memory length Markov models. ICDAR '95: Proceedings of the Third International Conference on Document Analysis and Recognition, Volume 1.
Input-output HMMs for sequence processing. IEEE Transactions on Neural Networks.
Time-delay neural networks: representation and induction of finite-state machines. IEEE Transactions on Neural Networks.
The Applicability of Recurrent Neural Networks for Biological Sequence Analysis. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB).
Recursive self-organizing network models. Neural Networks, 2004 special issue: New Developments in Self-Organizing Systems.
Rule Extraction from Recurrent Neural Networks: A Taxonomy and Review. Neural Computation.
Dynamics and Topographic Organization of Recursive Self-Organizing Maps. Neural Computation.
The Crystallizing Substochastic Sequential Machine Extractor: CrySSMEx. Neural Computation.
Elman Backpropagation as Reinforcement for Simple Recurrent Networks. Neural Computation.
ICANN '08: Proceedings of the 18th International Conference on Artificial Neural Networks, Part II.
Advances in Artificial Neural Systems.
Group-Linking Method: A Unified Benchmark for Machine Learning with Recurrent Neural Network. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences.
On the weight convergence of Elman networks. IEEE Transactions on Neural Networks.
Neurocomputing.
Quantized Neural Modeling: Hybrid Quantized Architecture in Elman Networks. Neural Processing Letters.
Recent experimental studies indicate that recurrent neural networks initialized with "small" weights are inherently biased toward definite memory machines (Tino, Cernansky, & Benusková, 2002a, 2002b). This article establishes a theoretical counterpart: the transition function of a recurrent network with small weights and a squashing activation function is a contraction. We prove that recurrent networks with a contractive transition function can be approximated arbitrarily well on input sequences of unbounded length by definite memory machines. Conversely, every definite memory machine can be simulated by a recurrent network with a contractive transition function. Hence, initialization with small weights induces an architectural bias into learning with recurrent neural networks. This bias might be beneficial from the point of view of statistical learning theory: it emphasizes one region of the weight space where generalization ability can be formally proved. It is well known that standard recurrent neural networks are not distribution-independent learnable in the probably approximately correct (PAC) sense if arbitrary precision and inputs are considered. We prove that recurrent networks with a contractive transition function with a fixed contraction parameter fulfill the so-called distribution-independent uniform convergence of empirical distances property and hence, unlike general recurrent networks, are distribution-independent PAC learnable.
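The contraction argument at the heart of the result can be checked directly. For a state update of the form f(x, u) = tanh(Wx + Vu + b), the squashing nonlinearity tanh is 1-Lipschitz, so ||f(x1, u) - f(x2, u)|| <= ||W|| ||x1 - x2|| in the spectral norm; whenever ||W|| < 1, the transition function is a contraction, and the influence of inputs more than k steps in the past decays like ||W||^k, which is the definite-memory behavior the theorem formalizes. Below is a minimal NumPy sketch of this effect; the network size, weight scale, and input distribution are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

n_state, n_in = 8, 3
scale = 0.05                                    # "small" weight scale (assumed)
W = rng.normal(0.0, scale, (n_state, n_state))  # recurrent weights
V = rng.normal(0.0, scale, (n_state, n_in))     # input weights
b = np.zeros(n_state)                           # bias

def step(x, u):
    # tanh is 1-Lipschitz, so x -> step(x, u) has Lipschitz
    # constant ||W|| (spectral norm) with respect to the state.
    return np.tanh(W @ x + V @ u + b)

print("spectral norm of W:", np.linalg.norm(W, 2))  # < 1: a contraction

# Drive two different initial states with the SAME input sequence;
# a contractive transition makes the state forget its initialization.
x1 = rng.normal(size=n_state)
x2 = rng.normal(size=n_state)
for t in range(30):
    u = rng.normal(size=n_in)
    x1, x2 = step(x1, u), step(x2, u)
    if (t + 1) % 5 == 0:
        print(f"t={t + 1:2d}  ||x1 - x2|| = {np.linalg.norm(x1 - x2):.2e}")
```

Because the state distance shrinks geometrically regardless of the inputs, the state after sufficiently many steps is determined, up to a small error, by a bounded suffix of the input sequence. This is exactly why such a network can be approximated by a definite memory machine, and why restricting learning to this region of weight space supports the generalization guarantees stated above.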