Robust PCA and clustering in noisy mixtures

  • Authors:
  • S. Charles Brubaker

  • Affiliations:
  • Georgia Institute of Technology

  • Venue:
  • SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents a polynomial algorithm for learning mixtures of logconcave distributions in Rn in the presence of malicious noise. That is, each sample is corrupted with some small probability, being replaced by a point about which we can make no assumptions. A key element of the algorithm is Robust Principle Components Analysis (PCA), which is less susceptible to corruption by noisy points. While noise may cause standard PCA to collapse well-separated mixture components so that they are indistinguishable, Robust PCA preserves the distance between some of the components, making a partition possible. It then recurses on each half of the mixture until every component is isolated. The success of this algorithm requires only a O(log n) factor increase in the required separation between components of the mixture compared to the noiseless case.