Two recent breakthroughs have dramatically improved the scope and performance of k-means clustering: squared Euclidean seeding for the initialization step, and Bregman clustering for the iterative step. In this paper, we first unite the two frameworks by generalizing the former to Bregman seeding, a biased randomized seeding technique based on Bregman divergences, while carrying over its important theoretical approximation guarantees. The result is a complete Bregman hard clustering algorithm that integrates the distortion at hand in both the initialization and the iterative steps. Our second contribution generalizes this algorithm further to handle mixed Bregman distortions, which smooth out the asymmetry of Bregman divergences. In contrast to other symmetrization approaches, ours keeps the algorithm simple and allows us to extend the theoretical guarantees of regular Bregman clustering. Preliminary experiments show that the proposed seeding, combined with a suitable Bregman divergence, can help uncover the underlying structure of the data.
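To make the seeding idea concrete, the following is a minimal sketch of k-means++-style biased seeding generalized to an arbitrary Bregman divergence, in the spirit of what the abstract describes. The function names and the specific sampling scheme shown here are illustrative assumptions, not the paper's exact algorithm: each new center is drawn with probability proportional to a point's divergence from its nearest already-chosen center. Choosing the generator phi(x) = ||x||^2 recovers ordinary squared-Euclidean (k-means++) seeding.

```python
import numpy as np

def bregman_divergence(x, y, phi, grad_phi):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

def bregman_seeding(X, k, phi, grad_phi, rng=None):
    """Biased randomized seeding with a Bregman divergence (illustrative sketch).

    The first center is drawn uniformly; each subsequent center is drawn
    with probability proportional to its divergence from the nearest
    center chosen so far.
    """
    rng = np.random.default_rng(rng)
    n = len(X)
    centers = [X[rng.integers(n)]]
    for _ in range(k - 1):
        # Divergence of each point to its closest current center.
        d = np.array([min(bregman_divergence(x, c, phi, grad_phi)
                          for c in centers) for x in X])
        centers.append(X[rng.choice(n, p=d / d.sum())])
    return np.array(centers)

# phi(x) = ||x||^2 makes D_phi the squared Euclidean distance.
phi = lambda x: np.dot(x, x)
grad_phi = lambda x: 2.0 * x
```

After seeding, the iterative (Lloyd-type) step would reassign points to the nearest center under the same divergence and update each center to its cluster mean, which is the Bregman-optimal centroid.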