Clustering high dimensional data is an emerging research field. Subspace clustering and projected clustering group similar objects in subspaces, i.e., projections, of the full space. In the past decade, several clustering paradigms have been developed in parallel, without thorough evaluation and comparison between these paradigms on a common basis. Conclusive evaluation and comparison are challenged by three major issues. First, there is no ground truth that describes the "true" clusters in real-world data. Second, a large variety of evaluation measures has been used, each reflecting different aspects of the clustering result. Finally, in typical publications authors have limited their analysis to their favored paradigm, paying little or no attention to other paradigms. In this paper, we take a systematic approach to evaluate the major paradigms in a common framework. We study representative clustering algorithms to characterize the different aspects of each paradigm and give a detailed comparison of their properties. We provide a benchmark set of results on a large variety of real-world and synthetic data sets. Using different evaluation measures, we broaden the scope of the experimental analysis and create a common baseline for future developments and comparable evaluations in the field. For repeatability, all implementations, data sets, and evaluation measures are available on our website.