We consider the problem of approximating a set $P$ of $n$ points in $\mathbb{R}^d$ by a collection of $j$-dimensional flats, and extensions thereof, under the standard median / mean / center measures, in which we wish to minimize, respectively, the sum of the distances from each point of $P$ to its nearest flat, the sum of the squares of these distances, or the maximal such distance. Such problems cannot be approximated unless P=NP, but they do allow bi-criteria approximations, in which one allows some leeway in both the number of flats and the quality of the objective function. We give a very simple bi-criteria approximation algorithm that produces at most $\alpha(k,j,n) = (k j \log n)^{O(j)}$ flats, whose objective value exceeds the optimal objective value for any $k$ $j$-dimensional flats by a factor of no more than $\beta(j) = 2^{O(j)}$. Given this bi-criteria approximation, we can use it to reduce the approximation factor arbitrarily, at the cost of increasing the number of flats. Our algorithm has many advantages over previous work: it is much more widely applicable (a wider set of objective functions and classes of clusters) and much more efficient, reducing the running time bound from $O(n^{\mathrm{Poly}(k,j)})$ to $nd \cdot (jk)^{O(j)}$. Our algorithm is randomized and succeeds with probability 1/2 (easily boosted to probabilities arbitrarily close to 1).
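
To make the three objective measures concrete, the following minimal Python sketch evaluates the median, mean, and center costs of a given collection of flats. It only scores a candidate solution and is not the bi-criteria algorithm itself. The function names `dist_to_flat` and `clustering_costs`, and the representation of a flat as an (origin, orthonormal basis) pair, are assumptions made for illustration.

```python
import numpy as np

def dist_to_flat(points, origin, basis):
    """Distance from each row of `points` to the flat origin + span(basis).

    Assumption: `basis` is a (j, d) array whose rows are orthonormal.
    """
    diff = points - origin
    # Component of each difference vector lying inside the flat's direction space.
    proj = diff @ basis.T @ basis
    return np.linalg.norm(diff - proj, axis=1)

def clustering_costs(points, flats):
    """Median / mean / center costs of `points` with respect to `flats`.

    `flats` is a list of (origin, basis) pairs (hypothetical representation).
    """
    # Each point is charged the distance to its nearest flat.
    nearest = np.min([dist_to_flat(points, o, b) for o, b in flats], axis=0)
    return {
        "median": float(nearest.sum()),         # sum of distances
        "mean": float((nearest ** 2).sum()),    # sum of squared distances
        "center": float(nearest.max()),         # maximal distance
    }

# Three points in the plane, scored against the x-axis (a 1-dimensional flat).
P = np.array([[0.0, 1.0], [0.0, -2.0], [3.0, 0.0]])
x_axis = (np.zeros(2), np.array([[1.0, 0.0]]))
costs = clustering_costs(P, [x_axis])
```

In this example the distances to the x-axis are 1, 2, and 0, so the median cost is 3, the mean cost is 5, and the center cost is 2. Adding the y-axis as a second flat drives all three costs to 0, which illustrates the trade-off between the number of flats and the objective value that a bi-criteria approximation exploits.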