Algebraic Analysis for Nonidentifiable Learning Machines
Neural Computation
Hierarchical learning machines, such as layered perceptrons, radial basis functions, and Gaussian mixtures, are non-identifiable learning machines whose Fisher information matrices are not positive definite. Consequently, conventional statistical asymptotic theory cannot be applied to neural network learning theory: for example, the Bayesian a posteriori probability distribution does not converge to a Gaussian distribution, and the generalization error is not proportional to the number of parameters. The purpose of this paper is to overcome this problem and to clarify the relation between the learning curve of a hierarchical learning machine and the algebraic geometrical structure of its parameter space. We establish an algorithm for calculating the Bayesian stochastic complexity based on the blow-up technique of algebraic geometry, and we prove that the Bayesian generalization error of a hierarchical learning machine is smaller than that of a regular statistical model, even if the true distribution is not contained in the parametric model.
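The claim that the Fisher information matrix fails to be positive definite can be checked numerically on a toy model. The following is a minimal sketch, assuming a two-hidden-unit tanh regression network; the model, parameter values, and variable names here are illustrative choices, not taken from the paper. At a parameter where the second unit is switched off (a2 = 0), the derivative of the output with respect to b2 vanishes identically, so one row of the empirical Fisher matrix is zero and the matrix is singular:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=100_000)

    # Model: f(x; w) = a1*tanh(b1*x) + a2*tanh(b2*x), w = (a1, b1, a2, b2).
    # The point a2 = 0 is singular: the parameter w is non-identifiable there.
    a1, b1, a2, b2 = 1.0, 1.0, 0.0, 0.7

    def grad_f(x):
        """Gradient of f with respect to (a1, b1, a2, b2)."""
        return np.stack([
            np.tanh(b1 * x),
            a1 * x / np.cosh(b1 * x) ** 2,
            np.tanh(b2 * x),
            a2 * x / np.cosh(b2 * x) ** 2,  # identically 0 when a2 = 0
        ])

    G = grad_f(X)                  # shape (4, n)
    F = (G @ G.T) / X.size         # empirical Fisher matrix (unit-noise regression)
    print(np.linalg.eigvalsh(F))   # smallest eigenvalue is 0: not positive definite

The quantitative content of "smaller generalization error" can also be sketched, in the standard notation of singular learning theory, which is assumed here rather than quoted from this page. Writing F(n) for the Bayesian stochastic complexity after n examples, the blow-up analysis yields the asymptotic expansion

    F(n) = \lambda \log n - (m - 1) \log \log n + O(1),
    G(n) = E[ F(n+1) - F(n) ] \approx \lambda / n,

where \lambda > 0 is a rational number determined by the largest pole of the zeta function J(z) = \int H(w)^z \varphi(w) \, dw of the Kullback divergence H(w) with prior \varphi, and m is the multiplicity of that pole. A regular statistical model with d parameters has \lambda = d/2 and m = 1; for the hierarchical machines above, resolving the singularities of the set H(w) = 0 by blow-ups gives \lambda \le d/2, which is the precise sense in which the Bayesian generalization error is smaller than that of a regular model.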