Learning machines that have hierarchical structures or hidden variables are singular statistical models: they are nonidentifiable and their Fisher information matrices are singular. In singular statistical models, the Bayes a posteriori distribution does not converge to a normal distribution, nor does the maximum likelihood estimator satisfy asymptotic normality. This is the main reason it has been difficult to predict their generalization performance from trained states. In this paper, we study four errors, (1) the Bayes generalization error, (2) the Bayes training error, (3) the Gibbs generalization error, and (4) the Gibbs training error, and prove that universal mathematical relations hold among them. The formulas proved in this paper are equations of states in statistical estimation because they hold for any true distribution, any parametric model, and any a priori distribution. We also show that the Bayes and Gibbs generalization errors can be estimated from the Bayes and Gibbs training errors, and we propose widely applicable information criteria that apply to both regular and singular statistical models.
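
For concreteness, the four errors can be written down explicitly. What follows is a hedged sketch using the standard setup for this line of work, not a quotation of the paper: q(x) denotes the true distribution, p(x|w) the parametric model, X_1, ..., X_n the sample, and E_w[.] the expectation over the Bayes a posteriori distribution at inverse temperature \beta:

\[
B_g = \mathbb{E}_X\!\left[\log\frac{q(X)}{\mathbb{E}_w[p(X\mid w)]}\right],
\qquad
B_t = \frac{1}{n}\sum_{i=1}^{n}\log\frac{q(X_i)}{\mathbb{E}_w[p(X_i\mid w)]},
\]
\[
G_g = \mathbb{E}_w\!\left[\mathbb{E}_X\!\left[\log\frac{q(X)}{p(X\mid w)}\right]\right],
\qquad
G_t = \mathbb{E}_w\!\left[\frac{1}{n}\sum_{i=1}^{n}\log\frac{q(X_i)}{p(X_i\mid w)}\right].
\]

The equations of states relate the expected values of these quantities. As reported in this line of work they take the following form (the paper itself is the authority for the exact statement):

\[
\mathbb{E}[B_g] = \mathbb{E}[B_t] + 2\beta\,\bigl(\mathbb{E}[G_t]-\mathbb{E}[B_t]\bigr) + o(1/n),
\qquad
\mathbb{E}[G_g] = \mathbb{E}[G_t] + 2\beta\,\bigl(\mathbb{E}[G_t]-\mathbb{E}[B_t]\bigr) + o(1/n).
\]

Because the right-hand sides contain only training quantities, the generalization errors become estimable from data, which is what motivates the widely applicable information criteria. Below is a minimal numerical sketch of the sample-based form of such a criterion as it is commonly computed from MCMC output; the names waic and logp are introduced here for illustration, and the toy posterior is a stand-in, not the paper's construction:

import numpy as np
from scipy.special import logsumexp

def waic(logp):
    """Widely applicable information criterion from posterior samples.

    logp is an (S, n) array with logp[s, i] = log p(x_i | w_s),
    the log-likelihood of observation i under posterior draw w_s.
    """
    S = logp.shape[0]
    # Log pointwise predictive density: log of the posterior-mean likelihood.
    lppd = np.sum(logsumexp(logp, axis=0) - np.log(S))
    # Penalty term: posterior variance of the log-likelihood, summed over
    # observations (the "functional variance" correction).
    p_waic = np.sum(np.var(logp, axis=0, ddof=1))
    return -2.0 * (lppd - p_waic)

# Toy usage: n = 100 observations, S = 1000 stand-in posterior draws
rng = np.random.default_rng(0)
x = rng.normal(size=100)
mu = rng.normal(scale=0.1, size=1000)  # pretend posterior draws of the mean
logp = -0.5 * np.log(2 * np.pi) - 0.5 * (x[None, :] - mu[:, None]) ** 2
print(waic(logp))

Unlike AIC, nothing in this computation requires the Fisher information matrix to be invertible, which is why such criteria remain applicable to the singular models discussed above.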