Optimal algorithms for approximate clustering

Authors:
Tomás Feder;Daniel Greene
Affiliations:
Computer Science Department, Stanford University, Stanford, CA;Xerox Palo Alto Research Center, Palo Alto, CA
Venue:
STOC '88 Proceedings of the twentieth annual ACM symposium on Theory of computing
Year:
1988

Abstract

In a clustering problem, the aim is to partition a given set of n points in d-dimensional space into k groups, called clusters, so that points within each cluster are near each other. Two objective functions frequently used to measure the performance of a clustering algorithm are, for any L_q metric, (a) the maximum distance between pairs of points in the same cluster, and (b) the maximum distance between points in each cluster and a chosen cluster center; we refer to either measure as the cluster size.

We show that one cannot approximate the optimal cluster size for a fixed number of clusters within a factor close to 2 in polynomial time, for two or more dimensions, unless P=NP. We also present an algorithm that achieves this factor of 2 in time O(n log k), and show that this running time is optimal in the algebraic decision tree model. For a fixed cluster size, on the other hand, we give a polynomial time approximation scheme that estimates the optimal number of clusters under the second measure of cluster size within factors arbitrarily close to 1. Our approach is extended to provide approximation algorithms for the restricted centers, suppliers, and weighted suppliers problems that run in optimal O(n log k) time and achieve optimal or nearly optimal approximation bounds.
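
To make measure (b) and the factor-2 guarantee concrete, here is a minimal sketch of the classical farthest-point greedy heuristic, commonly attributed to Gonzalez. It is not the O(n log k) algorithm of this paper: it runs in O(nk) time and is shown only because it reaches the same factor of 2 when centers are chosen among the input points. The function name `farthest_point_centers`, the choice of the first point as the initial center, and the use of the Euclidean (L_2) metric are all assumptions made for illustration.

```python
import math


def farthest_point_centers(points, k):
    """Pick up to k centers by farthest-point traversal (illustrative sketch).

    Returns (centers, radius), where radius is measure (b): the largest
    distance from any point to its nearest chosen center.
    Assumes Euclidean distance; points are equal-length tuples of numbers.
    """
    if not points or k <= 0:
        return [], 0.0

    # Arbitrary first center (an assumption; any input point works).
    centers = [points[0]]
    # dist[i] = distance from points[i] to its nearest chosen center so far.
    dist = [math.dist(p, centers[0]) for p in points]

    while len(centers) < min(k, len(points)):
        # The next center is the point farthest from all current centers.
        i = max(range(len(points)), key=dist.__getitem__)
        if dist[i] == 0.0:
            break  # every point already coincides with a center
        centers.append(points[i])
        # One O(n) pass to refresh nearest-center distances.
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]

    return centers, max(dist)


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (10, 0), (11, 0)]
    print(farthest_point_centers(pts, 2))
```

Each round costs one pass over the n points, which gives the O(nk) total. The abstract's O(n log k) bound is what distinguishes the paper's algorithm from this simpler baseline.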