The Hierarchical Mixture of Experts (HME) is a well-known tree-structured model for regression and classification, based on soft probabilistic splits of the input space. In its original formulation, its parameters are determined by maximum likelihood, which is prone to severe overfitting, including singularities in the likelihood function. Furthermore, the maximum likelihood framework offers no natural metric for optimizing the complexity and structure of the tree. Previous attempts to provide a Bayesian treatment of the HME model have either relied on local Gaussian representations based on the Laplace approximation, or modified the model so that it represents the joint distribution of both input and output variables, which can be wasteful of resources if the goal is prediction. In this paper we describe a fully Bayesian treatment of the original HME model based on variational inference. By combining 'local' and 'global' variational methods, we obtain a rigorous lower bound on the marginal probability of the data under the model. This bound is optimized during the training phase, and its resulting value can be used for model order selection. We present results using this approach for data sets describing robot arm kinematics.
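To make the "soft probabilistic splits" concrete, the sketch below computes the predictive mean of a two-level HME with softmax gates and linear experts. This is a minimal reading of the standard HME architecture rather than the paper's implementation; the function and parameter names (`hme_predict`, `V_top`, `V_low`, `W`) and the choice of linear experts are illustrative assumptions.

```python
import numpy as np

def softmax(a):
    # Numerically stable softmax over a 1-D score vector.
    a = a - a.max()
    e = np.exp(a)
    return e / e.sum()

def hme_predict(x, V_top, V_low, W):
    """Predictive mean of a two-level HME at input x (illustrative sketch).

    V_top : (M, d)     top-level gating weights over M branches
    V_low : (M, K, d)  per-branch gating weights over K experts
    W     : (M, K, d)  linear expert weights
    Each gate applies a softmax to linear scores of x, giving a soft
    probabilistic split of the input space rather than a hard partition.
    """
    g_top = softmax(V_top @ x)                  # P(branch m | x)
    y = 0.0
    for m in range(len(W)):
        g_low = softmax(V_low[m] @ x)           # P(expert k | branch m, x)
        y += g_top[m] * (g_low @ (W[m] @ x))    # gate-weighted expert means
    return y

# Tiny usage example with random parameters (hypothetical dimensions).
rng = np.random.default_rng(0)
d, M, K = 3, 2, 2
x = rng.normal(size=d)
print(hme_predict(x,
                  rng.normal(size=(M, d)),
                  rng.normal(size=(M, K, d)),
                  rng.normal(size=(M, K, d))))
```

Because every split is soft, the prediction is a smooth, differentiable mixture of the expert outputs, which is what makes both maximum likelihood and variational training of the tree tractable.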
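The 'global' part of the variational treatment yields a bound of the standard evidence-lower-bound form. Written generically for data D and parameters θ (the particular factorization of q chosen in the paper is not reproduced here):

```latex
\ln p(\mathcal{D}) \;=\; \ln \int p(\mathcal{D}, \boldsymbol{\theta})\, d\boldsymbol{\theta}
\;\ge\; \int q(\boldsymbol{\theta}) \ln \frac{p(\mathcal{D}, \boldsymbol{\theta})}{q(\boldsymbol{\theta})}\, d\boldsymbol{\theta}
\;=\; \mathcal{L}(q).
```

The 'local' methods refer to per-datapoint bounds on the logistic gating nonlinearities; a representative example is the standard Jaakkola-Jordan bound on the logistic sigmoid,

```latex
\sigma(z) \;\ge\; \sigma(\xi) \exp\!\left\{ \frac{z - \xi}{2} - \lambda(\xi)\,\bigl(z^2 - \xi^2\bigr) \right\},
\qquad
\lambda(\xi) \;=\; \frac{1}{2\xi}\left[\sigma(\xi) - \tfrac{1}{2}\right],
```

which holds for all z, with equality at z = ±ξ. Maximizing L(q) over q (and over the variational parameters ξ) tightens the bound during training, and the optimized value of the bound provides the model-order-selection score referred to in the abstract.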