We investigate a class of hierarchical mixtures-of-experts (HME) models in which exponential family regression models with generalized linear mean functions of the form ψ(α + x^Tβ) are mixed, where ψ(·) is the inverse link function. Suppose the true response y follows an exponential family regression model whose mean function belongs to a class of smooth functions of the form ψ(h(x)), where h(·) ∈ W_2^∞ (a Sobolev class over [0, 1]^s). It is shown that the HME probability density functions can approximate the true density at a rate of O(m^{-2/s}) in the L_p norm, and at a rate of O(m^{-4/s}) in Kullback-Leibler divergence, where m is the number of experts. These rates can be achieved within the family of HME structures with no more than s layers, where s is the dimension of the predictor x. It is also shown that likelihood-based inference based on HME is consistent in recovering the truth, in the sense that as the sample size n and the number of experts m both increase, the mean square error of the predicted mean response goes to zero. Conditions for such results to hold are stated and discussed.
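For concreteness, the base (one-layer) case of such a mixture can be written explicitly. The display below is an illustrative sketch, not the paper's exact notation: the gate parameters a_j, b_j and expert parameters α_j, β_j are generic symbols, and π(y; μ) denotes an exponential family density with mean μ.

    f(y | x) = Σ_{j=1}^m g_j(x) π(y; ψ(α_j + x^Tβ_j)),
    where g_j(x) = exp(a_j + x^Tb_j) / Σ_{k=1}^m exp(a_k + x^Tb_k).

An HME with more than one layer replaces each expert density by another gated mixture of the same form; per the result above, at most s levels of such nesting suffice to attain the stated rates.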