The need to analyze subspace projections of complex data is well established in the clustering community. While the full space may be obscured by overlapping patterns and irrelevant dimensions, certain subspaces can still reveal the clustering structure. Subspace clustering discards irrelevant dimensions and allows objects to belong to multiple, overlapping clusters, because each set of objects is analyzed in its own subspace projection. As we will demonstrate, the observations that motivate subspace projections for traditional clustering also apply to the task of correlation analysis. In this work, we introduce the novel paradigm of subspace correlation clustering: we analyze subspace projections to find subsets of objects that show linear correlations within a subset of the dimensions. In contrast to existing techniques, which determine correlations based on the full space, our method is able to exclude locally irrelevant dimensions, enabling more precise detection of the correlated features. Since we analyze subspace projections, each object can contribute to several correlations. Our model allows multiple overlapping clusters in general but simultaneously avoids redundant clusters deducible from already known correlations. We introduce the algorithm SSCC, which exploits different pruning techniques to efficiently generate a subspace correlation clustering. In thorough experiments, we demonstrate the strength of our novel paradigm in comparison to existing methods.
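To make the contrast between full-space and subspace correlation concrete, the following minimal sketch may help. It is not the SSCC algorithm and does not reproduce the paper's cluster model; it uses a deliberately simple proxy (the weakest pairwise Pearson correlation among a chosen set of dimensions) on hypothetical toy data. The function names `pearson` and `min_abs_correlation`, the data, and the scoring rule are all assumptions introduced for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def min_abs_correlation(objects, dims):
    """Weakest pairwise |r| among the selected dimensions.

    Illustrative proxy only (an assumption, not the paper's model):
    a value near 1 suggests the objects are linearly correlated in
    every pair of the selected dimensions.
    """
    columns = {d: [obj[d] for obj in objects] for d in dims}
    return min(
        abs(pearson(columns[a], columns[b]))
        for a, b in combinations(dims, 2)
    )


# Hypothetical toy objects (x, y, z): y = 2x + 1 holds exactly,
# while z is unrelated to x and y.
objects = [
    (0.0, 1.0, 5.0),
    (1.0, 3.0, -2.0),
    (2.0, 5.0, 4.0),
    (3.0, 7.0, -1.0),
    (4.0, 9.0, 3.0),
]

subspace_score = min_abs_correlation(objects, (0, 1))       # dimensions {x, y}
full_space_score = min_abs_correlation(objects, (0, 1, 2))  # dimensions {x, y, z}

print(f"subspace {{x, y}}:      {subspace_score:.3f}")
print(f"full space {{x, y, z}}: {full_space_score:.3f}")
```

Under these assumptions, the score in the subspace {x, y} is high because the objects lie on a line there, while the irrelevant dimension z pulls the full-space score down. This mirrors the abstract's point that excluding locally irrelevant dimensions can expose a correlation the full space hides.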