Decision theoretic generalizations of the PAC model for neural net and other learning applications
Information and Computation
ACM Computing Surveys (CSUR)
Sublinear time approximate clustering
SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
Approximate clustering via core-sets
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
Parallel Optimization: Theory, Algorithms and Applications
A divisive information theoretic feature clustering algorithm for text classification
The Journal of Machine Learning Research
On coresets for k-means and k-median clustering
STOC '04 Proceedings of the thirty-sixth annual ACM symposium on Theory of computing
A Simple Linear Time (1+ε)-Approximation Algorithm for k-Means Clustering in Any Dimensions
FOCS '04 Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science
Coresets in dynamic geometric data streams
Proceedings of the thirty-seventh annual ACM symposium on Theory of computing
Smaller coresets for k-median and k-means clustering
SCG '05 Proceedings of the twenty-first annual symposium on Computational geometry
On k-Median clustering in high dimensions
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
Clustering with Bregman Divergences
The Journal of Machine Learning Research
A PTAS for k-means clustering based on weak coresets
SCG '07 Proceedings of the twenty-third annual symposium on Computational geometry
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
k-means++: the advantages of careful seeding
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Clustering for metric and non-metric distance measures
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Survey of clustering algorithms
IEEE Transactions on Neural Networks
Worst-Case and Smoothed Analysis of k-Means Clustering with Bregman Divergences
ISAAC '09 Proceedings of the 20th International Symposium on Algorithms and Computation
Approximation algorithms for tensor clustering
ALT'09 Proceedings of the 20th international conference on Algorithmic learning theory
Clustering for metric and nonmetric distance measures
ACM Transactions on Algorithms (TALG)
Smoothed Analysis of the k-Means Method
Journal of the ACM (JACM)
Bregman clustering for separable instances
SWAT'10 Proceedings of the 12th Scandinavian conference on Algorithm Theory
Approximate Bregman near neighbors in sublinear time: beyond the triangle inequality
Proceedings of the twenty-eighth annual symposium on Computational geometry
We study the generalized k-median problem with respect to a Bregman divergence $D_\phi$. Given a finite set $P \subseteq \mathbb{R}^d$ of size $n$, our goal is to find a set $C$ of size $k$ such that the sum of errors $\mathrm{cost}(P, C) = \sum_{p \in P} \min_{c \in C} D_\phi(p, c)$ is minimized. The Bregman k-median problem plays an important role in many applications, e.g. information theory, statistics, text classification, and speech processing. We give the first coreset construction for this problem for a large subclass of Bregman divergences, including important dissimilarity measures such as the Kullback-Leibler divergence and the Itakura-Saito divergence. Using these coresets, we give a $(1 + \varepsilon)$-approximation algorithm for the Bregman k-median problem with running time $O\left(dkn + d^2 \, 2^{(k/\varepsilon)^{\Theta(1)}} \log^{k+2} n\right)$. This result improves over the previously fastest known $(1 + \varepsilon)$-approximation algorithm from [1]. Unlike the analysis of most coreset constructions, our analysis does not rely on the construction of ε-nets. Instead, we prove our results by purely combinatorial means.
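To make the objective concrete, the following is a minimal sketch of the cost function defined above, evaluated for the two divergences the abstract names. It is not the paper's coreset construction or approximation algorithm. The function names (`kl_divergence`, `itakura_saito`, `bregman_cost`) are hypothetical, the Kullback-Leibler divergence is taken in its generalized form for positive vectors, and all inputs are assumed to have strictly positive coordinates.

```python
import math


def kl_divergence(p, c):
    # Generalized Kullback-Leibler divergence between positive vectors p and c
    # (assumed form: sum of p_i * ln(p_i / c_i) - p_i + c_i).
    return sum(pi * math.log(pi / ci) - pi + ci for pi, ci in zip(p, c))


def itakura_saito(p, c):
    # Itakura-Saito divergence between positive vectors p and c.
    return sum(pi / ci - math.log(pi / ci) - 1.0 for pi, ci in zip(p, c))


def bregman_cost(points, centers, divergence):
    # cost(P, C): each point pays the divergence to its cheapest center.
    # Note the argument order: the point comes first, the center second.
    return sum(min(divergence(p, c) for c in centers) for p in points)


if __name__ == "__main__":
    # Illustrative data only, not taken from the paper.
    P = [[0.5, 0.5], [0.9, 0.1], [0.1, 0.9]]
    C = [[0.5, 0.5], [0.8, 0.2]]
    print("KL cost:", bregman_cost(P, C, kl_divergence))
    print("IS cost:", bregman_cost(P, C, itakura_saito))
```

Because Bregman divergences are generally asymmetric, swapping the arguments of `divergence` would define a different problem. The sketch evaluates the cost by brute force in O(dkn) time for a given center set; finding a good center set is the hard part that the coreset-based algorithm addresses.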