Controlling hidden layer capacity through lateral connections

  • Authors:
  • Kwabena Agyepong; Ravi Kothari

  • Affiliation:
  • Artificial Neural Systems Laboratory, Department of Electrical and Computer Engineering and Computer Science, University of Cincinnati, Cincinnati, OH 45221-0030, U.S.A.

  • Venue:
  • Neural Computation
  • Year:
  • 1997

Abstract

We investigate the effects of including selected lateral interconnections in a feedforward neural network. In a network with one hidden layer consisting of m hidden neurons labeled 1, 2, ..., m, hidden neuron j is connected fully to the inputs, the outputs, and hidden neuron j + 1. As a consequence of the lateral connections, each hidden neuron receives two error signals: one from the output layer and one through the lateral interconnection. We show that the use of these lateral interconnections among the hidden-layer neurons facilitates controlled assignment of role and specialization of the hidden-layer neurons. In particular, we show that as training progresses, hidden neurons become progressively specialized, starting from the fringes (i.e., the lower- and higher-numbered hidden neurons, e.g., 1, 2, m − 1, m) and leaving the neurons in the center of the hidden layer (i.e., hidden-layer neurons numbered close to m/2) unspecialized or functionally identical. Consequently, the network behaves like network-growing algorithms without the explicit need to add hidden units, and like soft weight sharing due to the functionally identical neurons in the center of the hidden layer. Experimental results from one classification and one function approximation problem are presented to illustrate the selective specialization of the hidden-layer neurons. In addition, the improved generalization that results from a decrease in the effective number of free parameters is illustrated through a simple function approximation example and with a real-world data set. Besides reducing the number of free parameters, the localization of weight sharing may also allow a procedural determination of the number of hidden-layer neurons required for a given learning task.
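
A minimal sketch of the architecture described in the abstract is given below: a single hidden layer in which neuron j also feeds neuron j + 1 through a lateral weight, so that during training each hidden neuron would receive one error term back-propagated from the output layer and a second one through its lateral connection. All names (W_in, W_out, w_lat, forward) and the exact formulation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, m, n_out = 4, 6, 1          # inputs, hidden neurons 1..m, outputs

W_in  = rng.normal(scale=0.1, size=(m, n_in))   # input  -> hidden weights
W_out = rng.normal(scale=0.1, size=(n_out, m))  # hidden -> output weights
w_lat = rng.normal(scale=0.1, size=m - 1)       # lateral weight from neuron j to j+1

def forward(x):
    """Compute hidden activations sequentially so that neuron j+1 also
    receives the activation of neuron j through its lateral connection
    (an assumed reading of the architecture, not the paper's equations)."""
    h = np.zeros(m)
    for j in range(m):
        net = W_in[j] @ x
        if j > 0:                   # lateral input from the previous neuron
            net += w_lat[j - 1] * h[j - 1]
        h[j] = sigmoid(net)
    y = sigmoid(W_out @ h)
    return h, y

h, y = forward(rng.normal(size=n_in))
# During training, the error signal for hidden neuron j would combine the
# usual term back-propagated from the output layer with a second term
# arriving through the lateral connection from neuron j+1.
```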