Learning a mixture model for clustering with the completed likelihood minimum message length criterion

Authors:
Hong Zeng;Yiu-Ming Cheung
Venue:
Pattern Recognition
Year:
2014

Abstract

Mixture model based clustering (hereinafter simply called model-based clustering) consists of fitting a mixture model to data and identifying each cluster with one of its components. This paper tackles the model selection and parameter estimation problems in model-based clustering, aiming to improve clustering performance on data sets whose true kernel distribution functions lie outside the assumed family, as well as on data sets with inherently overlapping clusters. First, an effective model selection criterion tailored to clustering applications is proposed. Unlike most criteria, which measure only how well the model fits the data as a generative model, the proposed criterion also evaluates whether the candidate model provides a reasonable partition of the observed data, which enforces a model with well-separated components. Accordingly, an improved method for estimating the mixture parameters is derived, which aims to suppress the spurious estimates produced by the standard expectation maximization (EM) algorithm and to enforce well-supported components in the mixture model. Finally, mixture parameter estimation and model selection are integrated into a single algorithm that favors a compact mixture model whose components are both well supported and well separated. Extensive experiments on synthetic and real-world data sets are carried out to show the effectiveness of the proposed approach to mixture model based clustering.
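The general idea of scoring a candidate mixture both on fit and on the quality of the partition it induces can be illustrated with a small sketch. This is not the criterion or the estimation method proposed in the paper, neither of which the abstract specifies. As a stand-in, it fits a one-dimensional Gaussian mixture with standard EM and compares BIC (fit only) with an ICL-style score that adds an entropy penalty on the soft assignments, so overlapping components are discouraged. The function names (`e_step`, `fit_gmm_1d`, `criteria`), the quantile initialization, the variance floor, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def e_step(xs, weights, means, variances):
    """Return responsibilities and the observed-data log-likelihood."""
    resp, loglik = [], 0.0
    for x in xs:
        logs = [
            math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for w, m, v in zip(weights, means, variances)
        ]
        top = max(logs)
        lse = top + math.log(sum(math.exp(l - top) for l in logs))  # log-sum-exp
        loglik += lse
        resp.append([math.exp(l - lse) for l in logs])
    return resp, loglik


def fit_gmm_1d(xs, k, iters=200):
    """Standard EM for a 1-D Gaussian mixture with k components (illustrative)."""
    n = len(xs)
    ordered = sorted(xs)
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]  # quantile init (assumption)
    centre = sum(xs) / n
    variances = [sum((x - centre) ** 2 for x in xs) / n] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        resp = e_step(xs, weights, means, variances)[0]
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-10)  # guard against empty components
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)  # variance floor (assumption)
    resp, loglik = e_step(xs, weights, means, variances)
    return weights, means, variances, resp, loglik


def criteria(xs, k):
    """BIC and an ICL-style score; lower is better for both."""
    _, _, _, resp, loglik = fit_gmm_1d(xs, k)
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    bic = -2.0 * loglik + n_params * math.log(len(xs))
    # Entropy of the soft assignments: near zero when components are well separated.
    entropy = -sum(r * math.log(r) for row in resp for r in row if r > 0.0)
    return {"bic": bic, "icl": bic + 2.0 * entropy, "entropy": entropy}


# Synthetic data: two well-separated clusters (illustrative only).
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(10.0, 1.0) for _ in range(100)]
scores = {k: criteria(xs, k) for k in (1, 2, 3)}
for k, s in scores.items():
    print(k, round(s["bic"], 1), round(s["icl"], 1))
```

Because the entropy term is non-negative, the ICL-style score is never lower than BIC for the same fit, and the gap grows as components overlap. That gap is the sense in which such a score rewards a clean partition in addition to goodness of fit.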