Using conditional independence for parsimonious model-based Gaussian clustering

Authors:
Giuliano Galimberti;Gabriele Soffritti
Affiliations:
Department of Statistics, University of Bologna, Bologna, Italy 40126;Department of Statistics, University of Bologna, Bologna, Italy 40126
Venue:
Statistics and Computing
Year:
2013

Abstract

In model-based cluster analysis, finite mixtures of Gaussian components are an important class of statistical models, widely used for quantitative variables. Within this class, we propose novel parsimonious Gaussian clustering models defined by constraints on the component-specific variance matrices. Specifically, the proposed models assume that the variables can be partitioned into groups that are conditionally independent within each component, which yields component-specific variance matrices with a block diagonal structure. This approach extends the methods for model-based cluster analysis and makes them more flexible. In this paper, Gaussian mixture models are studied under this assumption. Identifiability conditions are proved, and the model parameters are estimated by maximum likelihood using the Expectation-Maximization algorithm. The Bayesian information criterion is proposed for selecting the partition of the variables into conditionally independent groups, and the consistency of this criterion is proved under regularity conditions. A hierarchical algorithm is suggested for examining and comparing models with different partitions of the set of variables. A wide class of parsimonious Gaussian models is also presented by parameterizing the component variance matrices according to their spectral decomposition. The effectiveness and usefulness of the proposed methodology are illustrated with two examples based on real datasets.
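
The block diagonal structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the example partition, and the numeric values are all hypothetical, and the BIC is written in the "larger is better" form 2 log L - k log n, which is an assumed convention. The sketch shows two consequences of conditional independence between groups of variables within a component: the covariance needs fewer free parameters, and the Gaussian log-density factorizes into a sum over groups.

```python
import numpy as np

def block_diag_cov(blocks, partition, p):
    """Assemble a p x p covariance matrix from one block per variable group."""
    sigma = np.zeros((p, p))
    for idx, block in zip(partition, blocks):
        sigma[np.ix_(idx, idx)] = block
    return sigma

def n_cov_params(partition):
    """Free covariance parameters for one component: sum of q(q+1)/2 over groups."""
    return sum(len(g) * (len(g) + 1) // 2 for g in partition)

def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate Gaussian at a single point x."""
    d = len(x)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)

def block_logpdf(x, mean, blocks, partition):
    """Same log-density, computed as a sum over independent variable groups."""
    return sum(
        gaussian_logpdf(x[idx], mean[idx], block)
        for idx, block in zip(partition, blocks)
    )

def bic(loglik, n_params, n_obs):
    """BIC in the 'larger is better' form (an assumed convention)."""
    return 2.0 * loglik - n_params * np.log(n_obs)

# Hypothetical example: three variables, groups {0, 1} and {2}.
partition = [[0, 1], [2]]
blocks = [np.array([[2.0, 0.5], [0.5, 1.0]]), np.array([[1.5]])]
mean = np.array([0.0, 1.0, -1.0])
x = np.array([0.3, 0.8, -0.5])

sigma = block_diag_cov(blocks, partition, p=3)
full = gaussian_logpdf(x, mean, sigma)
factored = block_logpdf(x, mean, blocks, partition)

print(n_cov_params(partition), n_cov_params([[0, 1, 2]]))
print(np.isclose(full, factored))
```

With this partition, each component's covariance has 4 free parameters instead of the 6 of an unrestricted 3 x 3 matrix. In a model-selection loop, `bic` would be evaluated on the maximized mixture log-likelihood for each candidate partition, with `n_params` also counting the means and mixing weights.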