Bregman clustering for separable instances

Authors:
Marcel R. Ackermann;Johannes Blömer
Affiliations:
Department of Computer Science, University of Paderborn, Germany;Department of Computer Science, University of Paderborn, Germany
Venue:
SWAT'10 Proceedings of the 12th Scandinavian conference on Algorithm Theory
Year:
2010

Citing 22
Cited 1

Approximation schemes for Euclidean k-medians and related problems

STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
Approximate clustering via core-sets

STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
An Efficient k-Means Clustering Algorithm: Analysis and Implementation

IEEE Transactions on Pattern Analysis and Machine Intelligence
A Nearly Linear-Time Approximation Scheme for the Euclidean k-median Problem

ESA '99 Proceedings of the 7th Annual European Symposium on Algorithms
Approximation schemes for clustering problems

Proceedings of the thirty-fifth annual ACM symposium on Theory of computing
On coresets for k-means and k-median clustering

STOC '04 Proceedings of the thirty-sixth annual ACM symposium on Theory of computing
A Simple Linear Time $(1+\varepsilon)$-Approximation Algorithm for k-Means Clustering in Any Dimensions

FOCS '04 Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science
On k-Median clustering in high dimensions

SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithms
The Effectiveness of Lloyd-Type Methods for the k-Means Problem

FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
Clustering with Bregman Divergences

The Journal of Machine Learning Research
A PTAS for k-means clustering based on weak coresets

SCG '07 Proceedings of the twenty-third annual symposium on Computational geometry
k-means++: the advantages of careful seeding

SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Clustering for metric and non-metric distance measures

Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Mixed Bregman Clustering with Approximation Guarantees

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Coresets and approximate clustering for Bregman divergences

SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
k-means requires exponentially many iterations even in the plane

Proceedings of the twenty-fifth annual symposium on Computational geometry
Adaptive Sampling for k-Means Clustering

APPROX '09 / RANDOM '09 Proceedings of the 12th International Workshop and 13th International Workshop on Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques
On Coresets for $k$-Median and $k$-Means Clustering in Metric and Euclidean Spaces and Their Applications

SIAM Journal on Computing
Worst-Case and Smoothed Analysis of k-Means Clustering with Bregman Divergences

ISAAC '09 Proceedings of the 20th International Symposium on Algorithms and Computation
k-Means Has Polynomial Smoothed Complexity

FOCS '09 Proceedings of the 2009 50th Annual IEEE Symposium on Foundations of Computer Science
Linear time algorithms for clustering problems in any dimensions

ICALP'05 Proceedings of the 32nd international conference on Automata, Languages and Programming
On the optimality of conditional expectation as a Bregman predictor

IEEE Transactions on Information Theory

Approximate Bregman near neighbors in sublinear time: beyond the triangle inequality

Proceedings of the twenty-eighth annual symposium on Computational geometry

Abstract

The Bregman k-median problem is defined as follows. Given a Bregman divergence $D_\varphi$ and a finite set $P \subseteq {\mathbb R}^d$ of size $n$, our goal is to find a set $C$ of size $k$ such that the sum of errors $\mathrm{cost}(P,C) = \sum_{p \in P} \min_{c \in C} D_\varphi(p,c)$ is minimized. The Bregman k-median problem plays an important role in many applications, e.g., information theory, statistics, text classification, and speech processing. We study a generalization of the k-means++ seeding of Arthur and Vassilvitskii (SODA '07). We prove for an almost arbitrary Bregman divergence that if the input set consists of $k$ well-separated clusters, then with probability $2^{-{\mathcal O}(k)}$ this seeding step alone finds an ${\mathcal O}(1)$-approximate solution. Thereby, we generalize an earlier result of Ostrovsky et al. (FOCS '06) from the case of the Euclidean k-means problem to the Bregman k-median problem. Additionally, this result leads to a constant factor approximation algorithm for the Bregman k-median problem using at most $2^{{\mathcal O}(k)}n$ arithmetic operations, including evaluations of the Bregman divergence $D_\varphi$.
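
For illustration only, the cost function and a k-means++-style seeding can be sketched in a few lines of Python. The sketch rests on assumptions not stated in the abstract: the first center is drawn uniformly at random, and each further center is drawn with probability proportional to the divergence from the point to its closest already chosen center. The function names (`bregman_seeding`, `cost`, `kl_divergence`, `squared_euclidean`) are hypothetical, and the two divergences are standard examples of Bregman divergences rather than choices taken from the paper.

```python
import math
import random


def squared_euclidean(p, q):
    """Bregman divergence generated by phi(x) = ||x||^2 (the k-means case)."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def kl_divergence(p, q):
    """Generalized KL divergence, generated by phi(x) = sum_i x_i log x_i.

    Assumes all coordinates of p and q are strictly positive.
    """
    return sum(pi * math.log(pi / qi) - pi + qi for pi, qi in zip(p, q))


def cost(points, centers, divergence):
    """cost(P, C): sum over p in P of the minimum divergence D(p, c), c in C."""
    return sum(min(divergence(p, c) for c in centers) for p in points)


def bregman_seeding(points, k, divergence, rng=None):
    """Pick k centers from `points` by divergence-proportional sampling.

    Assumption: this mirrors k-means++ with D(p, c) in place of the
    squared Euclidean distance; it is a sketch, not the authors' code.
    """
    rng = rng or random.Random()
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        # Clamp at 0 to guard against tiny negative rounding errors.
        weights = [
            max(0.0, min(divergence(p, c) for c in centers)) for p in points
        ]
        if sum(weights) <= 0.0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


if __name__ == "__main__":
    # Two small, well-separated groups of positive vectors.
    data = [(1.0, 1.1), (1.1, 1.0), (0.9, 1.0), (9.0, 9.2), (9.1, 9.0)]
    seeds = bregman_seeding(data, 2, kl_divergence, random.Random(0))
    print(seeds, cost(data, seeds, kl_divergence))
```

The divergence is passed in as a function so that the same seeding routine covers both the Euclidean k-means case and other Bregman divergences. Note the argument order `divergence(p, c)`, matching $D_\varphi(p,c)$ in the cost definition, since Bregman divergences are not symmetric in general.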