A new variational Bayesian (VB) algorithm, split-and-eliminate VB (SEVB), is developed for modeling data via a Gaussian mixture model (GMM). The algorithm uses component splitting in a way that is better suited than existing VB-based approaches to analyzing large numbers of highly heterogeneous, spiky spatial patterns with weak prior information. SEVB is a highly computationally efficient approach to Bayesian inference and, like any VB-based algorithm, it performs model selection and parameter estimation simultaneously. A significant feature of the algorithm is that the fitted number of components is not limited by the initial proposal, giving increased modeling flexibility. We introduce two types of split operation, propose a new goodness-of-fit measure for evaluating mixture models, and evaluate their usefulness through empirical studies. We also illustrate the utility of the new approach in an application to modeling human mobility patterns. This application involves large volumes of highly heterogeneous, spiky data that the standard VB approach, being too restrictive, cannot model well. Empirical results suggest that our algorithm improves on the goodness-of-fit achieved by the standard VB method and is more robust to different initialization settings.
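The abstract does not spell out the split operations, but a common heuristic for splitting one Gaussian mixture component into two is to displace the new means along the component's principal axis and shrink the covariance in that direction. The sketch below illustrates that generic idea only; the function name, the `alpha` displacement parameter, and the covariance update are assumptions for illustration, not the paper's actual SEVB split operations.

```python
import numpy as np

def split_component(weight, mean, cov, alpha=0.5):
    """Split one Gaussian component into two along its principal axis.

    Generic split heuristic (hypothetical, not the SEVB splits):
    the two children share the parent's weight equally, their means
    are displaced +/- alpha standard deviations along the dominant
    eigenvector, and the variance in that direction is reduced.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Displacement vector: dominant eigenvector scaled by its std. dev.
    d = eigvecs[:, -1] * np.sqrt(eigvals[-1])
    mean1 = mean + alpha * d
    mean2 = mean - alpha * d
    # Shrink the covariance along the split direction so the two
    # children together roughly preserve the parent's spread.
    child_cov = cov - (alpha ** 2) * np.outer(d, d)
    return (weight / 2, mean1, child_cov), (weight / 2, mean2, child_cov)
```

With `alpha < 1` the shrunken covariance stays positive definite, since at most `alpha**2` of the variance along the principal axis is removed. In an SEVB-style loop, candidate splits like this would be accepted or rejected by re-running the VB updates and comparing the resulting bound or goodness-of-fit measure.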