Self-organization and associative memory: 3rd edition
What size net gives valid generalization?
Neural Computation
The cascade-correlation learning architecture
Advances in neural information processing systems 2
Bumptrees for efficient function, constraint, and classification learning
Advances in neural information processing systems 3 (NIPS-3)
Evaluation of adaptive mixtures of competing experts
Advances in neural information processing systems 3 (NIPS-3)
Neural computation and self-organizing maps: an introduction
An Algorithm for Finding Best Matches in Logarithmic Expected Time
ACM Transactions on Mathematical Software (TOMS)
A two-dimensional interpolation function for irregularly-spaced data
Proceedings of the 1968 23rd ACM national conference (ACM '68)
Predicting the future: Advantages of semilocal units
Neural Computation
Adaptive mixtures of local experts
Neural Computation
Active learning in neural networks
New learning paradigms in soft computing
Neural network for graphs: a contextual constructive approach
IEEE Transactions on Neural Networks
Incrementally constructed cascade architectures are a promising alternative to networks of predefined size. This paper compares the direct cascade architecture (DCA) proposed in Littmann and Ritter (1992) to the cascade-correlation approach of Fahlman and Lebiere (1990) and to related approaches, and discusses their properties on the basis of various benchmark results. One important virtue of DCA is that it allows the cascading of entire subnetworks, even if they do not admit error backpropagation. Exploiting this flexibility and using local linear map (LLM) networks as cascaded elements, we show that the performance of the resulting network cascades can be greatly enhanced compared to that of a single network. Our results for the Mackey-Glass time series prediction task indicate that such deeply cascaded network architectures achieve good generalization even on small data sets, whereas shallow, broad architectures of comparable size suffer from overfitting. We conclude that the DCA approach offers a powerful and flexible alternative to existing schemes, such as the mixture-of-experts approach, for the construction of modular systems from a wide range of subnetwork types.
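
The abstract only sketches the architecture, so the following Python snippet is a minimal, hypothetical illustration of the cascading idea rather than the paper's actual DCA or its LLM networks. It assumes, as one plausible reading of the abstract, that each stage is trained directly on the target and that the next stage receives the original input together with the previous stage's output; the RandomTanhStage subnetwork, its parameters, and the toy data are invented stand-ins (chosen because, like the cascaded elements described above, such a stage needs no error backpropagation).

```python
import numpy as np


class RandomTanhStage:
    """Hypothetical subnetwork: a fixed random tanh hidden layer with a
    ridge-regression readout, trained in closed form (no backpropagation)."""

    def __init__(self, n_hidden=30, reg=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.reg = reg
        self.seed = seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.W = rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)
        Hb = np.hstack([H, np.ones((len(X), 1))])  # add bias column
        A = Hb.T @ Hb + self.reg * np.eye(Hb.shape[1])
        self.beta = np.linalg.solve(A, Hb.T @ y)   # regularized least squares
        return self

    def predict(self, X):
        H = np.tanh(X @ self.W + self.b)
        Hb = np.hstack([H, np.ones((len(X), 1))])
        return Hb @ self.beta


class DirectCascade:
    """Sketch of a direct cascade: every stage is trained directly on the
    target, and each new stage sees the original inputs augmented with the
    previous stage's output (an assumed composition rule)."""

    def __init__(self, n_stages=4, make_stage=RandomTanhStage):
        self.n_stages = n_stages
        self.make_stage = make_stage
        self.stages = []

    def fit(self, X, y):
        features = X
        for k in range(self.n_stages):
            stage = self.make_stage(seed=k).fit(features, y)
            self.stages.append(stage)
            out = stage.predict(features).reshape(len(X), 1)
            features = np.hstack([X, out])  # original input + cascaded output
        return self

    def predict(self, X):
        features = X
        out = None
        for stage in self.stages:
            out = stage.predict(features).reshape(len(X), 1)
            features = np.hstack([X, out])
        return out.ravel()


if __name__ == "__main__":
    # Toy regression problem (not the Mackey-Glass benchmark from the paper).
    rng = np.random.default_rng(1)
    X = rng.uniform(-1.0, 1.0, size=(200, 3))
    y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.normal(size=200)
    model = DirectCascade(n_stages=4).fit(X, y)
    print("training MSE:", np.mean((model.predict(X) - y) ** 2))
```

Because each stage is an opaque, separately trained module, any regression subnetwork with fit/predict behavior could be swapped in for RandomTanhStage, which mirrors the abstract's point that DCA can cascade subnetworks of widely varying type.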