We show that a simple spectral algorithm for learning a mixture of k spherical Gaussians in R^n works remarkably well: it succeeds in identifying the Gaussians assuming essentially the minimum possible separation between their centers that keeps them unique, solving an open problem of Arora and Kannan (Proceedings of the 33rd ACM STOC, 2001). The sample complexity and running time are polynomial in both n and k. The algorithm can be applied to the more general problem of learning a mixture of "weakly isotropic" distributions (e.g., a mixture of uniform distributions on cubes).
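The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general spectral idea, under stated assumptions: project the samples onto the span of the top k right singular vectors of the sample matrix, then cluster in that k-dimensional subspace. The clustering step here (farthest-first seeding followed by nearest-center assignment) is a simple stand-in chosen for illustration and is not claimed to be the paper's procedure. The function names `spectral_project`, `farthest_first_centers`, and `cluster_spherical_mixture` are hypothetical.

```python
import numpy as np


def spectral_project(X, k):
    """Project the rows of X onto the span of its top-k right singular vectors."""
    # Rows of Vt are right singular vectors, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:k].T


def farthest_first_centers(Y, k):
    """Pick k seed points by farthest-first traversal (an illustrative assumption)."""
    centers = [Y[0]]
    # Distance from every point to its nearest chosen seed so far.
    dist = np.linalg.norm(Y - Y[0], axis=1)
    for _ in range(k - 1):
        idx = int(np.argmax(dist))
        centers.append(Y[idx])
        dist = np.minimum(dist, np.linalg.norm(Y - Y[idx], axis=1))
    return np.array(centers)


def cluster_spherical_mixture(X, k):
    """Return a cluster label in {0, ..., k-1} for each row of X."""
    Y = spectral_project(X, k)
    C = farthest_first_centers(Y, k)
    # Squared distance from each projected point to each seed, shape (m, k).
    sq = ((Y[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return np.argmin(sq, axis=1)
```

Calling `cluster_spherical_mixture(X, k)` on an m-by-n array of samples returns one label per row. The point of the projection is that it keeps the distances between the component centers largely intact while reducing the noise from n dimensions to k, which is what lets a weak separation between centers suffice. The seeding step in this sketch relies on clusters being well separated after projection and makes no attempt to match the guarantees stated in the abstract.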