Composed of several hundred processors, the Graphics Processing Unit (GPU) has become a very attractive platform for computationally demanding tasks on massive data. Its special hierarchy of processors and fast memory units allows very powerful and efficient parallelization but also demands novel parallel algorithms. Expectation Maximization (EM) is a widely used technique for maximum likelihood estimation. In this paper, we propose a novel EM clustering algorithm particularly suited for the GPU platform, specifically NVIDIA's Fermi architecture. The central idea of our algorithm is to let the parallel threads exchange their local information asynchronously and thus update their cluster representatives on demand, a technique we call Asynchronous Model Updates (Async-EM). Async-EM enables our algorithm not only to accelerate convergence but also to reduce the overhead induced by memory-bandwidth limitations and synchronization requirements. We demonstrate (1) how to reformulate the EM algorithm so that information can be exchanged via Async-EM and (2) how to exploit the special memory and processor architecture of a modern GPU to share this information among threads in an optimal way. As a perspective, Async-EM is not limited to EM but can be applied to a variety of related algorithms.
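The abstract's core idea, refreshing the shared mixture model from partial local statistics before a full pass completes, can be illustrated with a small sketch. The following is not the authors' CUDA implementation; it is a sequential Python simulation in which data "blocks" stand in for thread groups, and each block folds its local E-step statistics into the shared model as soon as it finishes (a stepwise/incremental update in the spirit of Async-EM). All names (`stepwise_em`, the decay exponent `alpha`, the 1-D two-component setting) are illustrative assumptions, not from the paper.

```python
import math
import random

def gauss_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def stepwise_em(data, batch=10, passes=5, alpha=0.7, seed=0):
    """Incremental EM for a two-component 1-D Gaussian mixture.

    Instead of one synchronized M-step per full pass, each mini-batch
    (simulating one thread group's local work) immediately blends its
    local sufficient statistics into the shared model -- the model is
    updated "on demand", mid-pass.
    """
    rng = random.Random(seed)
    data = list(data)
    # Crude but deterministic init: put the two components at the extremes.
    mus = [min(data), max(data)]
    vars_ = [1.0, 1.0]
    weights = [0.5, 0.5]
    # Shared sufficient statistics per component: (mass, sum x, sum x^2),
    # seeded to be consistent with the initial parameters.
    stats = [[1.0, mus[j], vars_[j] + mus[j] ** 2] for j in range(2)]
    t = 0
    for _ in range(passes):
        rng.shuffle(data)
        for start in range(0, len(data), batch):
            block = data[start:start + batch]
            # Local E-step on this block only.
            loc = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
            for x in block:
                ps = [weights[j] * gauss_pdf(x, mus[j], vars_[j]) for j in range(2)]
                z = sum(ps) or 1e-300
                for j in range(2):
                    r = ps[j] / z
                    loc[j][0] += r
                    loc[j][1] += r * x
                    loc[j][2] += r * x * x
            # Asynchronous-style update: blend the block's statistics into
            # the shared model right away, with a decaying step size.
            eta = (t + 2) ** (-alpha)
            t += 1
            for j in range(2):
                for m in range(3):
                    stats[j][m] = (1.0 - eta) * stats[j][m] + eta * (loc[j][m] / len(block))
                mass = stats[j][0]
                mus[j] = stats[j][1] / mass
                vars_[j] = max(stats[j][2] / mass - mus[j] ** 2, 1e-3)
            total = sum(s[0] for s in stats)
            weights = [s[0] / total for s in stats]
    return mus, vars_, weights

# Usage: recover two well-separated clusters.
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(300)] + [rng.gauss(5.0, 1.0) for _ in range(300)]
mus, vars_, weights = stepwise_em(data)
```

On a GPU, the blend step above is where the interesting engineering lives: the paper's contribution is doing this exchange through the device's memory hierarchy without global synchronization barriers, which this sequential sketch deliberately abstracts away.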