When a number of stochastic models are given in the form of probability distributions, one often needs to integrate them. Mixtures of distributions are frequently used for this purpose, but exponential mixtures also provide a good means of integration. This letter proposes a one-parameter family of integrations, called α-integration, which includes these well-known integrations as special cases. α-integrations generalize various means of numbers, such as the arithmetic, geometric, and harmonic means. Psychophysical experiments suggest that α-integration is used in the brain. The α-divergence between two distributions is defined as a natural generalization of the Kullback-Leibler divergence and the Hellinger distance, and α-integration is proved to be optimal in the sense of minimizing the α-divergence. The theory is applied to generalize the mixture of experts and the product of experts to the α-mixture of experts. The α-predictive distribution is also formulated in the Bayesian framework.
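A minimal numerical sketch of these ideas, assuming the α-representation f_α(u) = u^((1-α)/2) (with log u in the limit α → 1), under which α = -1, 1, and 3 recover the arithmetic, geometric, and harmonic means. The divergence below is written in one common convention for positive measures; sign and argument-order conventions vary in the literature, and all function names here are illustrative, not from the letter.

import numpy as np

def f_alpha(u, alpha):
    # Alpha-representation: f(u) = u**((1 - alpha)/2), or log(u) at alpha = 1.
    return np.log(u) if np.isclose(alpha, 1.0) else u ** ((1.0 - alpha) / 2.0)

def f_alpha_inv(v, alpha):
    # Inverse of the alpha-representation.
    return np.exp(v) if np.isclose(alpha, 1.0) else v ** (2.0 / (1.0 - alpha))

def alpha_mean(x, w, alpha):
    # Weighted alpha-integration of positive numbers: f^{-1}(sum_i w_i * f(x_i)).
    x, w = np.asarray(x, float), np.asarray(w, float)
    return f_alpha_inv(w @ f_alpha(x, alpha), alpha)

def alpha_mixture(ps, w, alpha):
    # Pointwise alpha-mean of discrete distributions, renormalized to sum to 1.
    # alpha = -1 gives the ordinary mixture; alpha = 1 gives the normalized
    # product of experts, i.e. the exponential mixture.
    ps, w = np.asarray(ps, float), np.asarray(w, float)
    m = f_alpha_inv(w @ f_alpha(ps, alpha), alpha)
    return m / m.sum()

def alpha_divergence(p, q, alpha):
    # D_alpha[p : q] between positive measures (alpha != +/-1); the limits
    # alpha -> +/-1 give the Kullback-Leibler divergences, and alpha = 0 is
    # proportional to the squared Hellinger distance.
    p, q = np.asarray(p, float), np.asarray(q, float)
    a = alpha
    return 4.0 / (1.0 - a * a) * np.sum(
        (1.0 - a) / 2.0 * p + (1.0 + a) / 2.0 * q
        - p ** ((1.0 - a) / 2.0) * q ** ((1.0 + a) / 2.0))

x, w = np.array([1.0, 4.0]), np.array([0.5, 0.5])
print(alpha_mean(x, w, -1.0))  # 2.5  (arithmetic mean)
print(alpha_mean(x, w, 1.0))   # 2.0  (geometric mean)
print(alpha_mean(x, w, 3.0))   # 1.6  (harmonic mean)

# Numerical check of the optimality claim: over a grid of candidate values m,
# the weighted sum  sum_i w_i * D_alpha[x_i : m]  is minimized at the alpha-mean.
grid = np.linspace(1.0, 4.0, 3001)
obj = [sum(wi * alpha_divergence([xi], [m], 3.0) for wi, xi in zip(w, x))
       for m in grid]
print(grid[int(np.argmin(obj))])  # ~1.6, matching alpha_mean(x, w, 3.0)

Applied pointwise to expert distributions and followed by normalization, as in alpha_mixture above, this construction is the α-mixture of experts described in the abstract: α = -1 reduces to the ordinary mixture of experts and α = 1 to the product of experts.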