Two recent breakthroughs have dramatically improved the scope and performance of k-means clustering: squared Euclidean seeding for the initialization step, and Bregman clustering for the iterative step. In this paper, we first unite the two frameworks by generalizing the former to Bregman seeding, a biased randomized seeding technique based on Bregman divergences, while carrying over its important theoretical approximation guarantees. The result is a complete Bregman hard clustering algorithm that integrates the distortion at hand in both the initialization and the iterative steps. Our second contribution generalizes this algorithm further to handle mixed Bregman distortions, which smooth out the asymmetry of Bregman divergences. In contrast to other symmetrization approaches, ours keeps the algorithm simple and allows us to extend the theoretical guarantees of regular Bregman clustering. Preliminary experiments show that the proposed seeding, combined with a suitable Bregman divergence, can help uncover the underlying structure of the data.
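To make the seeding idea concrete, the following is a minimal sketch of k-means++-style biased seeding generalized to an arbitrary Bregman divergence, in the spirit of what the abstract describes. The function names and the specific sampling scheme shown here are illustrative assumptions, not the paper's exact algorithm: each new center is drawn with probability proportional to a point's divergence from its nearest already-chosen center. Choosing the generator phi(x) = ||x||^2 recovers ordinary squared-Euclidean (k-means++) seeding.

```python
import numpy as np

def bregman_divergence(x, y, phi, grad_phi):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

def bregman_seeding(X, k, phi, grad_phi, rng=None):
    """Biased randomized seeding with a Bregman divergence (illustrative sketch).

    The first center is drawn uniformly; each subsequent center is drawn
    with probability proportional to its divergence from the nearest
    center chosen so far.
    """
    rng = np.random.default_rng(rng)
    n = len(X)
    centers = [X[rng.integers(n)]]
    for _ in range(k - 1):
        # Divergence of each point to its closest current center.
        d = np.array([min(bregman_divergence(x, c, phi, grad_phi)
                          for c in centers) for x in X])
        centers.append(X[rng.choice(n, p=d / d.sum())])
    return np.array(centers)

# phi(x) = ||x||^2 makes D_phi the squared Euclidean distance.
phi = lambda x: np.dot(x, x)
grad_phi = lambda x: 2.0 * x
```

After seeding, the iterative (Lloyd-type) step would reassign points to the nearest center under the same divergence and update each center to its cluster mean, which is the Bregman-optimal centroid.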