The neuromanifold, or parameter space, of multilayer perceptrons contains singularities at which the Fisher information matrix degenerates. Parameters are unidentifiable at such singularities, and this causes serious difficulties in learning, known as plateaus of the cost function. The natural gradient method, and its adaptive variant, has been proposed to overcome this difficulty. It is important to study the relation between the generalization error and the training error at singularities, because the generalization error is estimated in terms of the training error. Using a simple model, the generalization error is analyzed for both the maximum likelihood estimator (MLE) and the Bayesian predictive distribution, in terms of Gaussian random fields. This elucidates the strange behavior of learning dynamics around singularities.
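The degeneracy of the Fisher information matrix at a singularity can be illustrated numerically. The following is a minimal sketch, assuming a toy one-hidden-unit model f(x; a, b) = a·tanh(b·x) with unit Gaussian noise (this specific model is an illustrative choice, not taken from the paper): at b = 0 the output is identically zero for every a, so a is unidentifiable and the estimated Fisher information matrix becomes singular.

```python
# Monte Carlo sketch of Fisher information degeneracy at a singularity.
# Toy model (an illustrative assumption): f(x; a, b) = a * tanh(b * x)
# with unit Gaussian observation noise and inputs x ~ N(0, 1).
import numpy as np

def fisher_info(a, b, n=100_000, seed=0):
    """Monte Carlo estimate of the 2x2 Fisher information matrix at (a, b)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    # Gradient of f with respect to the parameters (a, b):
    df_da = np.tanh(b * x)
    df_db = a * x / np.cosh(b * x) ** 2
    g = np.stack([df_da, df_db])       # shape (2, n)
    return g @ g.T / n                 # E[(grad f)(grad f)^T], sigma = 1

F_regular = fisher_info(a=1.0, b=1.0)   # generic point: full rank
F_singular = fisher_info(a=1.0, b=0.0)  # singularity: df/da vanishes

print(np.linalg.det(F_regular))    # strictly positive
print(np.linalg.det(F_singular))   # (numerically) zero: FIM degenerates
```

Near b = 0 the model behaves like f ≈ (ab)·x, so only the product ab is identifiable; this is the kind of unidentifiable region where plateaus appear in the learning dynamics.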