We deal with the problem of clustering data points. Given n points in a large set (for example, R^d) endowed with a distance function (for example, the L^2 distance), we would like to partition the data set into k disjoint clusters, each with a "cluster center", so as to minimize the sum over all data points of the distance between the point and the center of the cluster containing the point. The problem is provably NP-hard in some high-dimensional geometric settings, even for k=2. We give polynomial-time approximation schemes for this problem in several settings, including the binary cube {0,1}^d with Hamming distance, and R^d with either the L^1 distance, the L^2 distance, or the square of the L^2 distance. In all these settings, the best previous results were constant-factor approximation guarantees. We note that our problem is similar in flavor to the k-median problem (and the related facility location problem), which has been considered in graph-theoretic and fixed-dimensional geometric settings, where it becomes hard when k is part of the input. In contrast, we study the problem when k is fixed but the dimension is part of the input. Our algorithms are based on a dimension reduction construction for the Hamming cube, which may be of independent interest.
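To make the objective concrete, the following Python sketch spells out the clustering cost in the binary-cube setting: each point pays the Hamming distance to the nearest of k chosen centers, and the goal is to minimize the total. The brute-force search over all k-tuples of centers in {0,1}^d is only an illustration of the objective (it is exponential in d); it is not the paper's algorithm, whose approximation scheme rests on the dimension reduction construction mentioned above. The function names here are hypothetical.

```python
from itertools import product


def hamming(p, q):
    # Number of coordinates in which two binary vectors differ.
    return sum(a != b for a, b in zip(p, q))


def clustering_cost(points, centers):
    # Each point is assigned to its nearest center and pays the
    # Hamming distance to it; the objective is the sum over all points.
    return sum(min(hamming(p, c) for c in centers) for p in points)


def best_k_clustering(points, k):
    # Exhaustive search over all k-tuples of centers in {0,1}^d.
    # Exponential in d -- illustrative only; the paper gives a PTAS
    # for this objective rather than exact exponential-time search.
    d = len(points[0])
    cube = list(product((0, 1), repeat=d))
    best_cost, best_centers = None, None
    for centers in product(cube, repeat=k):
        cost = clustering_cost(points, centers)
        if best_cost is None or cost < best_cost:
            best_cost, best_centers = cost, centers
    return best_cost, best_centers
```

For example, the four points (0,0,0), (0,0,1), (1,1,1), (1,1,0) split into two natural clusters, and the optimal k=2 cost is 2 (each cluster's center matches one of its points exactly and differs from the other in one coordinate).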