k-means projective clustering

Authors:
Pankaj K. Agarwal; Nabil H. Mustafa
Affiliations:
Duke University, Durham, NC; Duke University, Durham, NC
Venue:
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2004

Citing 16
Cited 24

Algorithms for clustering data

Algorithms for clustering data
BIRCH: an efficient data clustering method for very large databases

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
CURE: an efficient clustering algorithm for large databases

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Automatic subspace clustering of high dimensional data for data mining applications

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Fast algorithms for projected clustering

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Finding generalized projected clusters in high dimensional spaces

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Centroidal Voronoi Tessellations: Applications and Algorithms

SIAM Review
Data mining: concepts and techniques

Data mining: concepts and techniques
Maintaining approximate extent measures of moving points

SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
A Monte Carlo algorithm for fast projective clustering

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Optimal Grid-Clustering: Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
What Is the Nearest Neighbor in High Dimensional Spaces?

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Efficient and Effective Clustering Methods for Spatial Data Mining

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Near-Linear Time Approximation Algorithms for Curve Simplification

ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms
Approximate Shape Fitting via Linearization

FOCS '01 Proceedings of the 42nd IEEE symposium on Foundations of Computer Science

Dimension induced clustering

Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Matrix approximation and projective clustering via volume sampling

SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithms
Comparing Subspace Clusterings

IEEE Transactions on Knowledge and Data Engineering
How slow is the k-means method?

Proceedings of the twenty-second annual symposium on Computational geometry
Projective clustering using itemset discovery for multi-dimensional data analysis

MS'06 Proceedings of the 17th IASTED international conference on Modelling and simulation
Bi-criteria linear-time approximations for generalized k-mean/median/center

SCG '07 Proceedings of the twenty-third annual symposium on Computational geometry
k-means++: the advantages of careful seeding

SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Detecting eye fixations by projection clustering

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
2008 Special Issue: Interactive data analysis and clustering of genomic data

Neural Networks
Robust Clustering by Aggregation and Intersection Methods

KES '08 Proceedings of the 12th international conference on Knowledge-Based Intelligent Information and Engineering Systems, Part III
k-means requires exponentially many iterations even in the plane

Proceedings of the twenty-fifth annual symposium on Computational geometry
Interactive Visualization Tools for Meta-Clustering

Proceedings of the 2009 conference on New Directions in Neural Networks: 18th Italian Workshop on Neural Networks: WIRN 2008
Multiple data structure discovery through global optimisation, meta clustering and consensus methods

International Journal of Knowledge Engineering and Soft Data Paradigms
RACK: RApid clustering using K-means algorithm

CASE'09 Proceedings of the fifth annual IEEE international conference on Automation science and engineering
Global optimization, meta clustering and consensus clustering for class prediction

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Clustering by random projections

ICDM'07 Proceedings of the 7th industrial conference on Advances in data mining: theoretical aspects and applications
Fast clustering using MapReduce

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering very large multi-dimensional datasets with MapReduce

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Two-dimensional clustering algorithms for image segmentation

WSEAS Transactions on Computers
A near-linear algorithm for projective clustering integer points

Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms
A fuzzy subspace algorithm for clustering high dimensional data

ADMA'06 Proceedings of the Second international conference on Advanced Data Mining and Applications
Compression-aware I/O performance analysis for big data clustering

Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
The single pixel GPS: learning big data signals from tiny coresets

Proceedings of the 20th International Conference on Advances in Geographic Information Systems
Clustering under approximation stability

Journal of the ACM (JACM)

Abstract

In many applications it is desirable to cluster high-dimensional data along various subspaces, a task we refer to as projective clustering. We propose a new objective function for projective clustering that accounts for the inherent trade-off between the dimension of a subspace and the induced clustering error. We then present an extension of the k-means clustering algorithm for projective clustering in arbitrary subspaces, and also propose techniques to avoid local minima. Unlike previous algorithms, ours can choose the dimension of each cluster independently and automatically. Furthermore, experimental results show that our algorithm is significantly more accurate than previous approaches.
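
The abstract does not give the objective function or the techniques for avoiding local minima, so the following is only a rough sketch of the general idea of extending k-means to subspaces, not the authors' algorithm. It assumes a generic "k-flats" iteration: each cluster is represented by an affine subspace fitted by PCA, and each point is reassigned to the flat it lies closest to. Unlike the method described above, this sketch uses one fixed dimension `dim` for every cluster. The names `fit_flat`, `sq_dist_to_flat`, and `k_flats` are hypothetical and chosen for illustration only.

```python
import numpy as np

def fit_flat(points, dim):
    """Least-squares affine subspace (flat) of the given dimension, via SVD."""
    center = points.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(points - center, full_matrices=False)
    return center, vt[:dim]

def sq_dist_to_flat(points, center, basis):
    """Squared Euclidean distance from each point to the flat."""
    diff = points - center
    proj = diff @ basis.T @ basis  # component of diff lying inside the flat
    return ((diff - proj) ** 2).sum(axis=1)

def k_flats(points, k, dim, iters=50, seed=0):
    """Alternate between refitting flats and reassigning points (assumed scheme)."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(points))  # random initial partition
    for _ in range(iters):
        flats = []
        for j in range(k):
            members = points[labels == j]
            if len(members) <= dim:
                # Too few points to define a flat: refit on a random sample.
                idx = rng.choice(len(points), dim + 1, replace=False)
                members = points[idx]
            flats.append(fit_flat(members, dim))
        dists = np.stack([sq_dist_to_flat(points, c, b) for c, b in flats], axis=1)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    cost = dists[np.arange(len(points)), labels].sum()
    return labels, flats, cost

# Illustrative data (an assumption, not from the paper): two noisy lines in 3-D.
rng = np.random.default_rng(1)
t = rng.uniform(-1, 1, size=(100, 1))
line_a = t * np.array([1.0, 0.0, 0.0])
line_b = t * np.array([0.0, 1.0, 1.0]) + np.array([0.0, 0.0, 3.0])
data = np.vstack([line_a, line_b]) + 0.01 * rng.normal(size=(200, 3))
labels, flats, cost = k_flats(data, k=2, dim=1)
print("total squared error:", cost)
```

Like plain k-means, an iteration of this kind can stall in a poor local minimum depending on the initial partition, which is presumably why the abstract mentions dedicated techniques for avoiding local minima; rerunning with several values of `seed` and keeping the lowest `cost` is one simple, generic mitigation.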