Model-based clustering is a popular tool, renowned for its probabilistic foundations and its flexibility. However, model-based clustering techniques usually perform poorly on high-dimensional data streams, which are now a common data type. To overcome this limitation, we propose an online inference algorithm for the mixture of probabilistic PCA model. The proposed algorithm relies on an EM-based procedure and on a probabilistic, incremental version of PCA. Model selection is also handled in the online setting, through parallel computing. Numerical experiments on simulated and real data demonstrate the effectiveness of our approach and compare it with state-of-the-art online EM-based algorithms.
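To give a feel for what an online EM-based procedure looks like, here is a minimal sketch in Python. It is not the algorithm proposed in the paper: it swaps the mixture of probabilistic PCA for the much simpler mixture of isotropic Gaussians, and it omits both the incremental PCA step and model selection. The class name `OnlineIsotropicGMM`, the method `partial_fit`, and the step-size schedule `(n + 1) ** -0.6` are all assumptions made for illustration.

```python
import numpy as np


class OnlineIsotropicGMM:
    """Illustrative online EM for a mixture of isotropic Gaussians.

    Hypothetical helper, not the paper's method: each component has a mean
    and a single scalar variance, with no low-dimensional subspace.
    """

    def __init__(self, means, var=1.0):
        self.mu = np.array(means, dtype=float)       # (K, d) component means
        K, self.d = self.mu.shape
        self.pi = np.full(K, 1.0 / K)                # mixing proportions
        self.var = np.full(K, float(var))            # one variance per component
        # Running averages of the sufficient statistics
        # t_k, t_k * x and t_k * ||x||^2, initialised to match the parameters.
        self.s0 = self.pi.copy()
        self.s1 = self.s0[:, None] * self.mu
        self.s2 = self.s0 * (self.d * self.var + (self.mu ** 2).sum(axis=1))
        self.n = 0

    def responsibilities(self, x):
        """E-step for one observation: posterior component probabilities."""
        sq = ((x - self.mu) ** 2).sum(axis=1)
        log_p = (np.log(self.pi)
                 - 0.5 * self.d * np.log(2.0 * np.pi * self.var)
                 - 0.5 * sq / self.var)
        log_p -= log_p.max()                         # numerical stability
        t = np.exp(log_p)
        return t / t.sum()

    def partial_fit(self, x):
        """Process one observation and update all parameters."""
        x = np.asarray(x, dtype=float)
        self.n += 1
        gamma = (self.n + 1) ** -0.6                 # assumed decaying step size
        t = self.responsibilities(x)
        # Stochastic-approximation update of the sufficient statistics.
        self.s0 += gamma * (t - self.s0)
        self.s1 += gamma * (t[:, None] * x - self.s1)
        self.s2 += gamma * (t * (x @ x) - self.s2)
        # M-step: parameters in closed form from the running statistics.
        self.pi = self.s0 / self.s0.sum()
        self.mu = self.s1 / self.s0[:, None]
        second_moment = self.s2 / self.s0
        self.var = np.maximum(
            (second_moment - (self.mu ** 2).sum(axis=1)) / self.d, 1e-6)
        return self
```

Each call to `partial_fit` touches a single observation, so memory use stays independent of the stream length. A method along the lines described in the abstract would additionally maintain a per-component subspace, updated incrementally.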