A spectral algorithm for learning mixture models

Authors:
Santosh Vempala;Grant Wang
Affiliations:
Department of Mathematics, MIT, Cambridge, MA;Laboratory for Computer Science, MIT, Cambridge, MA
Venue:
Journal of Computer and System Sciences - Special issue on FOCS 2002
Year:
2004

Citing 8
Cited 24

Clustering in large graphs and matrices
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms

Latent semantic indexing: a probabilistic analysis
Journal of Computer and System Sciences - Special issue on the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on principles of database systems

Learning mixtures of arbitrary gaussians
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing

Spectral analysis of data
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing

A Two-Round Variant of EM for Gaussian Mixtures
UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence

Fast Monte-Carlo Algorithms for finding low-rank approximations
FOCS '98 Proceedings of the 39th Annual Symposium on Foundations of Computer Science

Learning Mixtures of Gaussians
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science

Logconcave Functions: Geometry and Efficient Sampling Algorithms
FOCS '03 Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science

On Learning Mixtures of Heavy-Tailed Distributions
FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science

The uniqueness of a good optimum for K-means
ICML '06 Proceedings of the 23rd international conference on Machine learning

An investigation of computational and informational limits in Gaussian mixture clustering
ICML '06 Proceedings of the 23rd international conference on Machine learning

The random projection method in goodness of fit for functional data
Computational Statistics & Data Analysis

A Probabilistic Analysis of EM for Mixtures of Separated, Spherical Gaussians
The Journal of Machine Learning Research

Smooth sensitivity and sampling in private data analysis
Proceedings of the thirty-ninth annual ACM symposium on Theory of computing

Spectral clustering with limited independence
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms

A discriminative framework for clustering via similarity functions
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing

Multiple Pass Streaming Algorithms for Learning Mixtures of Distributions in ${\mathbb R}^d$
ALT '07 Proceedings of the 18th international conference on Algorithmic Learning Theory

Clustering with Interactive Feedback
ALT '08 Proceedings of the 19th international conference on Algorithmic Learning Theory

Approximate clustering without the approximation
SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms

Robust PCA and clustering in noisy mixtures
SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms

Multiple pass streaming algorithms for learning mixtures of distributions in ${\mathbb R}^d$
Theoretical Computer Science

Graph characteristics from the heat kernel trace
Pattern Recognition

Spectral Algorithms
Foundations and Trends® in Theoretical Computer Science

Are there local maxima in the infinite-sample likelihood of Gaussian mixture estimation?
COLT'07 Proceedings of the 20th annual conference on Learning theory

Spectral methods for matrices and tensors
Proceedings of the forty-second ACM symposium on Theory of computing

Efficiently learning mixtures of two Gaussians
Proceedings of the forty-second ACM symposium on Theory of computing

Disentangling Gaussians
Communications of the ACM

The spectral method for general mixture models
COLT'05 Proceedings of the 18th annual conference on Learning Theory

Effective principal component analysis
SISAP'12 Proceedings of the 5th international conference on Similarity Search and Applications

Spectral learning of latent-variable PCFGs
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1

Clustering under approximation stability
Journal of the ACM (JACM)

Learning mixtures of arbitrary distributions over large discrete domains
Proceedings of the 5th conference on Innovations in theoretical computer science

Abstract

We show that a simple spectral algorithm for learning a mixture of k spherical Gaussians in ${\mathbb R}^n$ works remarkably well: it succeeds in identifying the Gaussians assuming essentially the minimum possible separation between their centers that keeps them unique, solving an open problem of Arora and Kannan (Proceedings of the 33rd ACM STOC, 2001). The sample complexity and running time are polynomial in both n and k. The algorithm can be applied to the more general problem of learning a mixture of "weakly isotropic" distributions (e.g., a mixture of uniform distributions on cubes).
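
The abstract does not spell out the algorithm's steps, so the following is only an illustrative sketch and not the authors' procedure. It assumes the common spectral recipe of projecting the samples onto the span of the top k right singular vectors of the sample matrix and then grouping the projected points by distance. The function names, the greedy grouping rule, the `radius` parameter, and the synthetic data (three unit-variance spherical Gaussians with well-separated centers) are all assumptions made for illustration.

```python
import numpy as np


def spectral_project(samples, k):
    """Project each row of `samples` onto the span of the top-k right singular vectors."""
    # Rows of vt are right singular vectors, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(samples, full_matrices=False)
    return samples @ vt[:k].T  # shape: (num_samples, k)


def greedy_distance_cluster(points, k, radius):
    """Hypothetical grouping step: pick an unlabeled point and label every
    unlabeled point within `radius` of it; repeat at most k times."""
    labels = np.full(len(points), -1, dtype=int)
    for cluster_id in range(k):
        unlabeled = np.flatnonzero(labels < 0)
        if unlabeled.size == 0:
            break
        seed = points[unlabeled[0]]
        dist = np.linalg.norm(points[unlabeled] - seed, axis=1)
        labels[unlabeled[dist <= radius]] = cluster_id
    return labels


# Synthetic mixture (assumed setup): k spherical Gaussians in R^n with unit variance.
rng = np.random.default_rng(0)
n, k, per_component = 50, 3, 200
centers = np.zeros((k, n))
centers[np.arange(k), np.arange(k)] = 20.0  # centers 20*e_1, 20*e_2, 20*e_3
true_labels = np.repeat(np.arange(k), per_component)
samples = centers[true_labels] + rng.standard_normal((k * per_component, n))

projected = spectral_project(samples, k)
# radius=10.0 is a hand-picked threshold for this toy data, not a derived bound.
labels = greedy_distance_cluster(projected, k, radius=10.0)
```

The point of the projection is that spherical noise shrinks from roughly $\sqrt{n}$ to roughly $\sqrt{k}$ in norm, while the distances between centers are approximately preserved, so a simple distance rule can separate the components in the projected space. The separation used here is far larger than the near-minimal separation the abstract refers to; the sketch shows only the mechanics.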