An investigation of computational and informational limits in Gaussian mixture clustering

  • Authors: Nathan Srebro; Gregory Shakhnarovich; Sam Roweis

  • Affiliations: University of Toronto, Toronto, Ontario, Canada; Brown University, Providence, Rhode Island; University of Toronto, Toronto, Ontario, Canada

  • Venue: ICML '06: Proceedings of the 23rd International Conference on Machine Learning
  • Year: 2006

Abstract

We investigate under what conditions clustering by learning a mixture of spherical Gaussians is (a) computationally tractable; and (b) statistically possible. We show that using principal component projection greatly aids in recovering the clustering using EM; present empirical evidence that even using such a projection, there is still a large gap between the number of samples needed to recover the clustering using EM, and the number of samples needed without computational restrictions; and characterize the regime in which such a gap exists.
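
As an illustration of the pipeline the abstract describes (principal component projection followed by EM for a mixture of spherical Gaussians), here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The function names `pca_project` and `em_spherical_gmm`, the random initialization from data points, the per-cluster variances, and the fixed iteration count are all assumptions made for illustration; the abstract does not specify these details, nor how many principal components to keep.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal components (hypothetical helper)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def em_spherical_gmm(X, k, n_iter=100, seed=0):
    """Fit a k-component spherical Gaussian mixture with plain EM (illustrative sketch).

    Assumptions: means start at k distinct random data points, and each
    cluster has its own scalar variance.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, size=k, replace=False)]
    var = np.full(k, X.var())
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point,
        # computed in log space for numerical stability.
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_p = np.log(weights) - 0.5 * d * np.log(2 * np.pi * var) - 0.5 * sq / var
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate mixing weights, means, and scalar variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        var = (resp * sq).sum(axis=0) / (d * nk) + 1e-9

    # Hard clustering: assign each point to its most responsible component.
    return resp.argmax(axis=1), means, var, weights
```

A typical call would first reduce the data, for example `Z = pca_project(X, k - 1)`, and then cluster it with `labels, means, var, weights = em_spherical_gmm(Z, k)`. Keeping `k - 1` components is a common choice for `k` spherical clusters, but it is an assumption here rather than something the abstract states.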