Learning mixtures of arbitrary gaussians

Authors:
Sanjeev Arora;Ravi Kannan
Affiliations:
Dept of Computer Science, Princeton University;Dept of Computer Science, Yale University
Venue:
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
Year:
2001

Citing 4
Cited 30

A constant-factor approximation algorithm for the k-median problem (extended abstract)

STOC '99 Proceedings of the thirty-first annual ACM symposium on Theory of computing
Estimating a mixture of two product distributions

COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Learning Mixtures of Gaussians

FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
Sampling according to the multivariate normal density

FOCS '96 Proceedings of the 37th Annual Symposium on Foundations of Computer Science

Inferring a Union of Halfspaces from Examples

COCOON '02 Proceedings of the 8th Annual International Conference on Computing and Combinatorics
Database-friendly random projections: Johnson-Lindenstrauss with binary coins

Journal of Computer and System Sciences - Special issue on PODS 2001
Optimal Time Bounds for Approximate Clustering

Machine Learning
A spectral algorithm for learning mixture models

Journal of Computer and System Sciences - Special issue on FOCS 2002
On the impossibility of dimension reduction in l1

Journal of the ACM (JACM)
On Learning Mixtures of Heavy-Tailed Distributions

FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science
Learning mixtures of product distributions over discrete domains

FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science
The space complexity of pass-efficient algorithms for clustering

SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithms
An investigation of computational and informational limits in Gaussian mixture clustering

ICML '06 Proceedings of the 23rd international conference on Machine learning
Spectral clustering with limited independence

SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
A rigorous analysis of population stratification with limited data

SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
A discriminative framework for clustering via similarity functions

STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
Clustering with Interactive Feedback

ALT '08 Proceedings of the 19th international conference on Algorithmic Learning Theory
Heuristic Kalman algorithm for solving optimization problems

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Are there local maxima in the infinite-sample likelihood of Gaussian mixture estimation?

COLT'07 Proceedings of the 20th annual conference on Learning theory
Separating populations with wide data: a spectral analysis

ISAAC'07 Proceedings of the 18th international conference on Algorithms and computation
Minimum sum-of-squares clustering by DC programming and DCA

ICIC'09 Proceedings of the Intelligent computing 5th international conference on Emerging intelligent computing technology and applications
Efficiently learning mixtures of two Gaussians

Proceedings of the forty-second ACM symposium on Theory of computing
Generalized clustering via kernel embeddings

KI'09 Proceedings of the 32nd annual German conference on Advances in artificial intelligence
Optimal time bounds for approximate clustering

UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
Disentangling Gaussians

Communications of the ACM
PAC learning axis-aligned mixtures of gaussians with no separation assumption

COLT'06 Proceedings of the 19th annual conference on Learning Theory
The spectral method for general mixture models

COLT'05 Proceedings of the 18th annual conference on Learning Theory
On spectral learning of mixtures of distributions

COLT'05 Proceedings of the 18th annual conference on Learning Theory
Toward privacy in public databases

TCC'05 Proceedings of the Second international conference on Theory of Cryptography
Gaussian kernel minimum sum-of-squares clustering and solution method based on DCA

ACIIDS'12 Proceedings of the 4th Asian conference on Intelligent Information and Database Systems - Volume Part II
Random direction divisive clustering

Pattern Recognition Letters
Learning mixtures of spherical gaussians: moment methods and spectral decompositions

Proceedings of the 4th conference on Innovations in Theoretical Computer Science
Clustering under approximation stability

Journal of the ACM (JACM)
New and efficient DCA based algorithms for minimum sum-of-squares clustering

Pattern Recognition

Abstract

Mixtures of gaussian (or normal) distributions arise in a variety of application areas. Many techniques have been proposed for the task of finding the component gaussians given samples from the mixture, such as the EM algorithm, a local-search heuristic from Dempster, Laird and Rubin (1977). However, such heuristics are known to require time exponential in the dimension (i.e., number of variables) in the worst case, even when the number of components is $2$. This paper presents the first algorithm that provably learns the component gaussians in time that is polynomial in the dimension. The gaussians may have arbitrary shape provided they satisfy a “nondegeneracy” condition, which requires their high-probability regions to be not “too close” together.
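
To make the learning task concrete, the following is a minimal sketch of the EM heuristic mentioned in the abstract, restricted to a one-dimensional mixture of two gaussians. It is not the algorithm of this paper, and it carries no guarantee in high dimension. The function names (`sample_mixture`, `em_two_gaussians`), the initialization at the data extremes, and the mixture parameters in the usage example are all assumptions chosen for illustration.

```python
import math
import random


def sample_mixture(n, weights, means, stds, rng):
    """Draw n points from a 1-D mixture of gaussians (illustrative helper)."""
    points = []
    for _ in range(n):
        # Pick a component according to the mixing weights, then sample from it.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        points.append(rng.gauss(means[k], stds[k]))
    return points


def normal_pdf(x, mean, std):
    """Density of a 1-D gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def em_two_gaussians(points, iterations=50):
    """Run EM for a two-component 1-D mixture; returns (weights, means, stds)."""
    # Assumed initialization: place the two means at the data extremes.
    means = [min(points), max(points)]
    stds = [1.0, 1.0]
    weights = [0.5, 0.5]
    n = len(points)
    for _ in range(iterations):
        # E-step: posterior probability that each point came from component 0.
        resp0 = []
        for x in points:
            p0 = weights[0] * normal_pdf(x, means[0], stds[0])
            p1 = weights[1] * normal_pdf(x, means[1], stds[1])
            total = p0 + p1
            resp0.append(p0 / total if total > 0.0 else 0.5)
        # M-step: re-estimate each component from its weighted points.
        for k in (0, 1):
            resp = resp0 if k == 0 else [1.0 - r for r in resp0]
            mass = sum(resp)
            if mass < 1e-12:
                continue  # component received no weight; keep old parameters
            weights[k] = mass / n
            means[k] = sum(r * x for r, x in zip(resp, points)) / mass
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(resp, points)) / mass
            stds[k] = max(math.sqrt(var), 1e-6)  # guard against collapse
    return weights, means, stds


if __name__ == "__main__":
    rng = random.Random(0)
    data = sample_mixture(2000, [0.4, 0.6], [-5.0, 5.0], [1.0, 1.5], rng)
    print(em_two_gaussians(data))
```

The two components in the usage example are deliberately well separated, which is the easy regime for a local-search heuristic. The abstract's point is that no such behavior is guaranteed in general.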