Bayesian hierarchical mixtures of experts

Authors:
Christopher M. Bishop;Markus Svensén
Affiliations:
Microsoft Research, Cambridge, U.K.;Microsoft Research, Cambridge, U.K.
Venue:
UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence
Year:
2003

Abstract

The Hierarchical Mixture of Experts (HME) is a well-known tree-structured model for regression and classification, based on soft probabilistic splits of the input space. In its original formulation, its parameters are determined by maximum likelihood, which is prone to severe overfitting, including singularities in the likelihood function. Furthermore, the maximum likelihood framework offers no natural metric for optimizing the complexity and structure of the tree. Previous attempts to provide a Bayesian treatment of the HME model have either relied on local Gaussian representations based on the Laplace approximation or modified the model so that it represents the joint distribution of both input and output variables, which can be wasteful of resources if the goal is prediction. In this paper we describe a fully Bayesian treatment of the original HME model based on variational inference. By combining 'local' and 'global' variational methods, we obtain a rigorous lower bound on the marginal probability of the data under the model. This bound is optimized during the training phase, and its resulting value can be used for model order selection. We present results using this approach for data sets describing robot arm kinematics.
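
To make the "soft probabilistic splits" concrete, the following is a minimal sketch of how a tree-structured mixture of experts could form a prediction. It is not the authors' implementation and covers only the predictive mean, not the variational training described in the abstract. The binary tree, the logistic gating function, the linear experts, and all names and parameter values (`tree`, `leaf_weights`, `predict_mean`) are assumptions chosen for illustration.

```python
import math


def sigmoid(a):
    """Logistic sigmoid, written to avoid overflow for large |a|."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))


def leaf_weights(node, x, prob=1.0):
    """Yield (path probability, expert weights) for every leaf of the tree.

    Assumed layout (hypothetical): a gating node is a dict
    {"gate": v, "left": ..., "right": ...}; a leaf is {"expert": w}.
    """
    if "expert" in node:
        yield prob, node["expert"]
        return
    # Soft split: the input goes left with probability g and right with 1 - g.
    g = sigmoid(dot(node["gate"], x))
    yield from leaf_weights(node["left"], x, prob * g)
    yield from leaf_weights(node["right"], x, prob * (1.0 - g))


def predict_mean(tree, x):
    """Mixture mean: each linear expert's output weighted by its path probability."""
    return sum(p * dot(w, x) for p, w in leaf_weights(tree, x))


# Illustrative tree with three experts; inputs are x = (1, x1), where the
# leading 1 acts as a bias term. All numbers are made up.
tree = {
    "gate": [0.0, 4.0],
    "left": {"expert": [1.0, 2.0]},
    "right": {
        "gate": [1.0, -1.0],
        "left": {"expert": [0.0, -1.0]},
        "right": {"expert": [3.0, 0.5]},
    },
}

x = [1.0, 0.0]
probs = [p for p, _ in leaf_weights(tree, x)]
print(probs, predict_mean(tree, x))
```

Because each gate divides its incoming probability between two children, the path probabilities over all leaves sum to one for any input, so the prediction is a convex combination of the expert outputs. Maximum likelihood would fit the gate and expert weights directly; the Bayesian treatment summarized above instead places distributions over them.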