Mixed Bregman Clustering with Approximation Guarantees

  • Authors:
  • Richard Nock;Panu Luosto;Jyrki Kivinen

  • Affiliations:
  • CEREGMIA -- Université Antilles-Guyane, Schoelcher, France;Department of Computer Science, University of Helsinki, Finland;Department of Computer Science, University of Helsinki, Finland

  • Venue:
  • ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Two recent breakthroughs have dramatically improved the scope and performance of k-means clustering: squared Euclidean seeding for the initialization step, and Bregman clustering for the iterative step. In this paper, we first unite the two frameworks by generalizing the former improvement to Bregman seeding-- a biased randomized seeding technique using Bregman divergences -- while generalizing its important theoretical approximation guarantees as well. We end up with a complete Bregman hard clustering algorithm integrating the distortion at hand in both the initialization and iterative steps. Our second contribution is to further generalize this algorithm to handle mixed Bregman distortions, which smooth out the asymetricity of Bregman divergences. In contrast to some other symmetrization approaches, our approach keeps the algorithm simple and allows us to generalize theoretical guarantees from regular Bregman clustering. Preliminary experiments show that using the proposed seeding with a suitable Bregman divergence can help us discover the underlying structure of the data.