Generalization Error and Training Error at Singularities of Multilayer Perceptrons

  • Authors:
  • Shun-ichi Amari; Tomoko Ozeki; Hyeyoung Park

  • Venue:
  • IWANN '01 Proceedings of the 6th International Work-Conference on Artificial and Natural Neural Networks: Connectionist Models of Neurons, Learning Processes and Artificial Intelligence-Part I
  • Year:
  • 2001

Abstract

The neuromanifold, or the parameter space of multilayer perceptrons, includes complex singularities at which the Fisher information matrix degenerates. The parameters are unidentifiable at singularities, and this causes serious difficulties in learning, known as plateaus in the cost function. The natural gradient method, and its adaptive version, have been proposed to overcome this difficulty. It is important to study the relation between the generalization error and the training error at the singularities, because the generalization error is estimated in terms of the training error. The generalization error is studied for both the maximum likelihood estimator (mle) and the Bayesian predictive distribution estimator in terms of the Gaussian random field, by using a simple model. This elucidates the strange behaviors of learning dynamics around singularities.