Algebraic Analysis for Nonidentifiable Learning Machines
Neural Computation
Hierarchical learning machines, such as layered perceptrons, radial basis functions, and Gaussian mixtures, are non-identifiable learning machines whose Fisher information matrices are not positive definite. Consequently, conventional statistical asymptotic theory cannot be applied to neural network learning theory: for example, the Bayesian a posteriori probability distribution does not converge to a Gaussian distribution, and the generalization error is not proportional to the number of parameters. The purpose of this paper is to overcome this problem and to clarify the relation between the learning curve of a hierarchical learning machine and the algebraic geometrical structure of its parameter space. We establish an algorithm for calculating the Bayesian stochastic complexity based on the blow-up technique of algebraic geometry, and we prove that the Bayesian generalization error of a hierarchical learning machine is smaller than that of a regular statistical model, even if the true distribution is not contained in the parametric model.
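The claim that the Fisher information matrix fails to be positive definite can be checked numerically on a toy model. The following is a minimal sketch, assuming a two-hidden-unit tanh regression network; the model, parameter values, and variable names here are illustrative choices, not taken from the paper. At a parameter where the second unit is switched off (a2 = 0), the derivative of the output with respect to b2 vanishes identically, so one row of the empirical Fisher matrix is zero and the matrix is singular:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=100_000)

    # Model: f(x; w) = a1*tanh(b1*x) + a2*tanh(b2*x), w = (a1, b1, a2, b2).
    # The point a2 = 0 is singular: the parameter w is non-identifiable there.
    a1, b1, a2, b2 = 1.0, 1.0, 0.0, 0.7

    def grad_f(x):
        """Gradient of f with respect to (a1, b1, a2, b2)."""
        return np.stack([
            np.tanh(b1 * x),
            a1 * x / np.cosh(b1 * x) ** 2,
            np.tanh(b2 * x),
            a2 * x / np.cosh(b2 * x) ** 2,  # identically 0 when a2 = 0
        ])

    G = grad_f(X)                  # shape (4, n)
    F = (G @ G.T) / X.size         # empirical Fisher matrix (unit-noise regression)
    print(np.linalg.eigvalsh(F))   # smallest eigenvalue is 0: not positive definite

The quantitative content of "smaller generalization error" can also be sketched, in the standard notation of singular learning theory, which is assumed here rather than quoted from this page. Writing F(n) for the Bayesian stochastic complexity after n examples, the blow-up analysis yields the asymptotic expansion

    F(n) = \lambda \log n - (m - 1) \log \log n + O(1),
    G(n) = E[ F(n+1) - F(n) ] \approx \lambda / n,

where \lambda > 0 is a rational number determined by the largest pole of the zeta function J(z) = \int H(w)^z \varphi(w) \, dw of the Kullback divergence H(w) with prior \varphi, and m is the multiplicity of that pole. A regular statistical model with d parameters has \lambda = d/2 and m = 1; for the hierarchical machines above, resolving the singularities of the set H(w) = 0 by blow-ups gives \lambda \le d/2, which is the precise sense in which the Bayesian generalization error is smaller than that of a regular model.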