To gain insight into today's large data resources, data mining provides automatic aggregation techniques. Clustering aims at grouping data such that objects within groups are similar while objects in different groups are dissimilar. In scenarios with many attributes or with noise, clusters are often hidden in subspaces of the data and do not show up in the full-dimensional space. For these applications, subspace clustering methods aim at detecting clusters in any subspace. Existing subspace clustering approaches fall prey to an effect we call dimensionality bias. As the dimensionality of subspaces varies, approaches that do not take this effect into account fail to separate clusters from noise. We give a formal definition of dimensionality bias and analyze its consequences for subspace clustering. A dimensionality-unbiased subspace clustering (DUSC) definition based on statistical foundations is proposed. In thorough experiments on synthetic and real-world data, we show that our approach outperforms existing subspace clustering algorithms.
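The effect described in the abstract can be illustrated with a small sketch. It is not the paper's formal definition of dimensionality bias or of DUSC. It assumes uniformly distributed noise in the unit hypercube, a Euclidean neighborhood of fixed radius, and ignores boundary effects. The function names, the radius, the fixed threshold, and the factor are all hypothetical choices made for illustration.

```python
import math


def ball_volume(dim: int, radius: float) -> float:
    """Volume of a dim-dimensional Euclidean ball with the given radius."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * radius ** dim


def expected_neighbors(n_objects: int, dim: int, radius: float) -> float:
    """Expected neighborhood count under uniform noise in [0, 1]^dim.

    Assumption: boundary effects are ignored, so the expectation is simply
    n_objects times the ball volume.
    """
    return n_objects * ball_volume(dim, radius)


def is_dense_fixed(count: int, threshold: int) -> bool:
    """Density test with one threshold for every subspace (biased)."""
    return count >= threshold


def is_dense_normalized(count: int, n_objects: int, dim: int,
                        radius: float, factor: float) -> bool:
    """Density test relative to the count expected from uniform noise.

    Hypothetical normalization: a neighborhood is dense if it holds at
    least `factor` times as many objects as uniform noise would produce
    in a subspace of this dimensionality.
    """
    return count >= factor * expected_neighbors(n_objects, dim, radius)


if __name__ == "__main__":
    n, eps = 1000, 0.1
    for d in (1, 2, 3):
        # Expected counts shrink quickly as dimensionality grows:
        # roughly 200, 31.4, and 4.2 objects for d = 1, 2, 3.
        print(d, round(expected_neighbors(n, d, eps), 1))

    # Pure noise in a 1-dimensional subspace passes a fixed threshold of 20.
    print(is_dense_fixed(200, 20))
    # A 3-dimensional neighborhood with 15 objects is rejected by the same
    # threshold, although it holds more than 3 times the expected count.
    print(is_dense_fixed(15, 20))
    # The normalized test reverses both decisions.
    print(is_dense_normalized(200, n, 1, eps, 2.0))
    print(is_dense_normalized(15, n, 3, eps, 2.0))
```

With a single threshold, low-dimensional noise looks dense and genuinely concentrated high-dimensional neighborhoods look sparse. Comparing each count against what uniform noise would yield in a subspace of the same dimensionality is one simple way to make density values comparable across subspaces.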