Combination of supervised and unsupervised learning for training the activation functions of neural networks

  • Authors: Ilaria Castelli; Edmondo Trentin

  • Venue: Pattern Recognition Letters
  • Year: 2014

Abstract

Standard feedforward neural networks benefit from the attractive theoretical properties of mixtures of sigmoid activation functions, yet they may fail in several practical learning tasks. Such tasks are better addressed with a more appropriate, problem-specific basis of activation functions. The paper presents a connectionist model that exploits adaptive activation functions. Each hidden unit in the network is associated with a specific pair (f(.), p(.)), where f(.) is the activation function and p(.) is the likelihood that the unit is relevant to computing the network output for the current input. The function f(.) is optimized in a supervised manner, while p(.) is realized by a parametric statistical model learned through unsupervised (or partially supervised) estimation. Because f(.) and p(.) influence each other's learning, the overall machine is implicitly a co-trained coupled model and, in turn, a flexible, non-standard neural architecture. The feasibility of the approach is supported by empirical evidence from computer simulations on regression and classification tasks.
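
The pairing of an activation function f(.) with a relevance likelihood p(.) can be illustrated with a minimal forward-pass sketch. The abstract does not give the functional forms, so everything concrete here is an assumption made for illustration: f(.) is taken to be a sigmoid with a trainable slope and amplitude, p(.) is taken to be a normalized isotropic Gaussian likelihood per hidden unit, and the two are combined by a simple product. All function and parameter names are hypothetical and are not taken from the paper.

```python
import math

def gaussian_pdf(x, mean, var):
    """Isotropic Gaussian density of input x (a list of floats)."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return math.exp(-0.5 * sq_dist / var) / ((2.0 * math.pi * var) ** (d / 2.0))

def relevance(x, means, variances):
    """p_j(x) for every hidden unit j.

    Assumption: one Gaussian per unit, normalized across units so the
    relevances sum to 1. The paper's actual parametric model may differ.
    """
    dens = [gaussian_pdf(x, m, v) for m, v in zip(means, variances)]
    total = sum(dens)
    if total == 0.0:
        # Input far from every Gaussian: fall back to uniform relevance.
        return [1.0 / len(dens)] * len(dens)
    return [d / total for d in dens]

def adaptive_activation(a, slope, amplitude):
    """f_j(a): sigmoid with trainable slope and amplitude (assumed form)."""
    return amplitude / (1.0 + math.exp(-slope * a))

def forward(x, weights, biases, slopes, amplitudes, means, variances, out_weights):
    """Network output: sum over hidden units of v_j * p_j(x) * f_j(w_j . x + b_j).

    Assumption: relevance gates the activation multiplicatively.
    """
    p = relevance(x, means, variances)
    y = 0.0
    for j in range(len(weights)):
        a = sum(w * xi for w, xi in zip(weights[j], x)) + biases[j]
        y += out_weights[j] * p[j] * adaptive_activation(a, slopes[j], amplitudes[j])
    return y
```

Under this assumed product form, the gradient of a supervised loss with respect to the parameters of f_j is scaled by p_j(x), so the unsupervised estimate of p(.) determines which inputs shape each activation function. This is one simple way the two learning processes could influence each other, as the abstract describes.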