We investigate the effects of including selected lateral interconnections in a feedforward neural network. In a network with one hidden layer of m hidden neurons labeled 1, 2, ..., m, hidden neuron j is fully connected to the inputs, to the outputs, and to hidden neuron j + 1. As a consequence of the lateral connections, each hidden neuron receives two error signals: one from the output layer and one through the lateral interconnection. We show that these lateral interconnections among the hidden-layer neurons facilitate controlled role assignment and specialization of the hidden-layer neurons. In particular, as training progresses, hidden neurons become progressively specialized, starting from the fringes (i.e., the lowest- and highest-numbered hidden neurons, e.g., 1, 2, m - 1, m) and leaving the neurons in the center of the hidden layer (i.e., those numbered close to m/2) unspecialized or functionally identical. Consequently, the network behaves like network-growing algorithms without the explicit need to add hidden units, and like soft weight sharing because of the functionally identical neurons in the center of the hidden layer. Experimental results from one classification problem and one function approximation problem illustrate the selective specialization of the hidden-layer neurons. In addition, the improved generalization that results from a decrease in the effective number of free parameters is illustrated with a simple function approximation example and with a real-world data set. Beyond the reduction in the number of free parameters, the localization of weight sharing may also enable a procedure for determining the number of hidden-layer neurons required for a given learning task.
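The architecture and the two error signals described above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the activation function, weight initialization, and loss (sigmoid units, small Gaussian weights, squared error) are assumptions, since the abstract does not specify them. Hidden neuron j receives lateral input from neuron j - 1 on the forward pass, so during backpropagation its error combines a signal from the output layer with a second signal arriving through the lateral link from neuron j + 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes for illustration: 3 inputs, m = 5 hidden neurons, 2 outputs.
n_in, m, n_out = 3, 5, 2
W_in = rng.normal(0, 0.1, (m, n_in))    # input -> hidden weights
W_lat = rng.normal(0, 0.1, m - 1)       # lateral weight from neuron j to j+1
W_out = rng.normal(0, 0.1, (n_out, m))  # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    # Hidden activations are computed in order 1..m because neuron j
    # receives lateral input from neuron j-1.
    h = np.zeros(m)
    for j in range(m):
        net = W_in[j] @ x
        if j > 0:
            net += W_lat[j - 1] * h[j - 1]  # lateral input from neuron j-1
        h[j] = sigmoid(net)
    y = sigmoid(W_out @ h)
    return h, y

def hidden_deltas(x, target):
    # Each hidden neuron gets two error signals: one from the outputs and,
    # except for neuron m, one through the lateral link from neuron j+1,
    # so deltas are computed in reverse order m..1.
    h, y = forward(x)
    delta_out = (y - target) * y * (1 - y)  # squared-error loss, sigmoid output
    delta_h = np.zeros(m)
    for j in reversed(range(m)):
        err = W_out[:, j] @ delta_out       # error signal from the output layer
        if j < m - 1:
            err += W_lat[j] * delta_h[j + 1]  # error signal via lateral link
        delta_h[j] = err * h[j] * (1 - h[j])
    return delta_h
```

The reverse-order delta computation is what lets error information propagate along the chain of lateral links, which is the mechanism the paper credits for specialization spreading inward from hidden neurons 1, 2 and m - 1, m toward the center of the layer.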