We consider the problem of learning mixtures of distributions via spectral methods and derive a characterization of when such methods are useful. Specifically, given a mixture-sample, let $\bar\mu_{i}$, $\bar C_{i}$, $\bar w_{i}$ denote the empirical mean, covariance matrix, and mixing weight of the samples from the $i$-th component. We prove that a very simple algorithm, namely spectral projection followed by single-linkage clustering, properly classifies every point in the sample, provided that each pair of means $\bar\mu_{i}, \bar\mu_{j}$ is well separated, in the sense that $\|\bar\mu_{i} - \bar\mu_{j}\|^{2}$ is at least $\|\bar C_{i}\|_{2}\,(1/\bar w_{i} + 1/\bar w_{j})$ plus a term that depends on the concentration properties of the distributions in the mixture. This second term is very small for many distributions, including Gaussians, log-concave distributions, and many others. As a result, we get the best known bounds for learning mixtures of arbitrary Gaussians in terms of the required mean separation. At the same time, we prove that there are many Gaussian mixtures $\{(\mu_{i}, C_{i}, w_{i})\}$ such that each pair of means is separated by $\|C_{i}\|_{2}\,(1/w_{i} + 1/w_{j})$, yet upon spectral projection the mixture collapses completely, i.e., all means and covariance matrices in the projected mixture are identical.
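For concreteness, here is a minimal sketch in Python of the two-step procedure the abstract names: project the sample onto its top-$k$ singular directions, then run single-linkage clustering in the projected space. The function name `spectral_single_linkage` and the use of NumPy/SciPy are my illustration, not code from the paper, and details such as the exact projection rank and linkage cutoff are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def spectral_single_linkage(X, k):
    """Hypothetical sketch: classify a mixture-sample by spectral
    projection followed by single-linkage clustering.

    X : (n, d) array, one sample point per row.
    k : number of mixture components.
    Returns integer labels in {1, ..., k}, one per row of X.
    """
    # Spectral projection: keep each point's coordinates along the
    # top-k right singular vectors of the sample matrix, i.e., project
    # onto the best rank-k subspace.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Y = X @ Vt[:k].T  # (n, k) projected sample

    # Single-linkage clustering in the projected space, cut into k groups.
    Z = linkage(Y, method="single")
    return fcluster(Z, t=k, criterion="maxclust")

# Usage on a toy two-component spherical Gaussian mixture:
rng = np.random.default_rng(0)
A = rng.normal(loc=0.0, size=(200, 50))
B = rng.normal(loc=5.0, size=(200, 50))
labels = spectral_single_linkage(np.vstack([A, B]), k=2)
```

The point of the projection step is that, under the separation condition above, the top-$k$ subspace nearly contains the span of the component means, so inter-component distances survive the projection while the ambient noise is reduced; single-linkage then recovers the components as well-separated connected groups.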