Automated hierarchical mixtures of probabilistic principal component analyzers

  • Authors:
  • Ting Su; Jennifer G. Dy

  • Affiliations:
  • Northeastern University, Boston, MA; Northeastern University, Boston, MA

  • Venue:
  • ICML '04 Proceedings of the twenty-first international conference on Machine learning
  • Year:
  • 2004

Abstract

Many clustering algorithms fail when dealing with high-dimensional data. Principal component analysis (PCA) is a popular dimensionality reduction algorithm, but it assumes a single multivariate Gaussian model and therefore provides only a global linear projection of the data. A mixture of probabilistic principal component analyzers (PPCA) is better suited to clustering: it provides a local linear PCA projection for each multivariate Gaussian cluster component. We extend this model to build hierarchical mixtures of PPCA. Hierarchical clustering provides a flexible representation that shows relationships among clusters at various perceptual levels. We introduce an automated hierarchical mixture of PPCA algorithm that uses the integrated classification likelihood as the criterion for splitting clusters and for stopping the addition of hierarchical levels. A fully automated approach requires automated methods for initialization, for determining the number of principal component dimensions, and for deciding when to split clusters; we address each of these in the paper. The result is a coarse-to-fine local component model with a different projection and a different number of dimensions for each cluster.
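
The sketch below is a rough illustration of the coarse-to-fine idea, not the authors' implementation. It substitutes readily available stand-ins for the paper's components: scikit-learn's GaussianMixture in place of the PPCA mixture components, BIC in place of the integrated classification likelihood as the split/stop criterion, and a 90% explained-variance threshold in place of the paper's automated per-cluster dimension selection. The names split_node, fit_local_pca, min_size, and var_threshold are hypothetical.

```python
# Minimal sketch of recursive cluster splitting with local PCA projections.
# Assumptions (not from the source): GaussianMixture stands in for the PPCA
# components, BIC for the integrated classification likelihood, and a 90%
# explained-variance threshold for the automated dimension selection.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture


def fit_local_pca(X, var_threshold=0.90):
    # Local linear projection: keep the fewest components whose cumulative
    # explained variance reaches the threshold (hypothetical rule).
    full = PCA().fit(X)
    q = int(np.searchsorted(np.cumsum(full.explained_variance_ratio_),
                            var_threshold)) + 1
    return PCA(n_components=q).fit(X)


def split_node(X, min_size=20):
    # Each node of the hierarchy stores a local PCA model; children refine it.
    node = {"size": len(X), "pca": fit_local_pca(X), "children": []}
    if len(X) < 2 * min_size:
        return node

    one = GaussianMixture(n_components=1, covariance_type="full",
                          random_state=0).fit(X)
    two = GaussianMixture(n_components=2, covariance_type="full",
                          random_state=0).fit(X)
    # Accept the split only if the 2-component model scores better (lower BIC),
    # standing in for the paper's splitting/stopping criterion.
    if two.bic(X) < one.bic(X):
        labels = two.predict(X)
        for k in (0, 1):
            child = X[labels == k]
            if len(child) >= min_size:
                node["children"].append(split_node(child, min_size))
    return node


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated Gaussian blobs in 10 dimensions.
    X = np.vstack([rng.normal(0.0, 1.0, (200, 10)),
                   rng.normal(6.0, 1.0, (200, 10))])
    tree = split_node(X)
    print("root keeps", tree["pca"].n_components_, "dims;",
          len(tree["children"]), "children")
```

On this toy data the root is split once and each child keeps its own, typically different, number of principal components, which mirrors the varying per-cluster projections and dimensionalities described in the abstract.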