Automatic subspace clustering of high dimensional data for data mining applications
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
OPTICS: ordering points to identify the clustering structure
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Entropy-based subspace clustering for mining numerical data
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
ACM Computing Surveys (CSUR)
Co-clustering documents and words using bipartite spectral graph partitioning
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
When Is ''Nearest Neighbor'' Meaningful?
ICDT '99 Proceedings of the 7th International Conference on Database Theory
On the Surprising Behavior of Distance Metrics in High Dimensional Spaces
ICDT '01 Proceedings of the 8th International Conference on Database Theory
WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Optimal Grid-Clustering: Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
What Is the Nearest Neighbor in High Dimensional Spaces?
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
STING: A Statistical Information Grid Approach to Spatial Data Mining
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
O-Cluster: Scalable Clustering of Large High Dimensional Data Sets
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Adaptive dimension reduction for clustering high dimensional data
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration
Data Mining and Knowledge Discovery
Relationship-Based Clustering and Visualization for High-Dimensional Data Mining
INFORMS Journal on Computing
Subspace clustering for high dimensional data: a review
ACM SIGKDD Explorations Newsletter - Special issue on learning from imbalanced datasets
Kernel k-means: spectral clustering and normalized cuts
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Biclustering Algorithms for Biological Data Analysis: A Survey
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Cluster Analysis for Gene Expression Data: A Survey
IEEE Transactions on Knowledge and Data Engineering
Toward Integrating Feature Selection Algorithms for Classification and Clustering
IEEE Transactions on Knowledge and Data Engineering
Document Clustering Using Locality Preserving Indexing
IEEE Transactions on Knowledge and Data Engineering
Locally adaptive metrics for clustering high dimensional data
Data Mining and Knowledge Discovery
A survey of kernel and spectral methods for clustering
Pattern Recognition
Experiencing SAX: a novel symbolic representation of time series
Data Mining and Knowledge Discovery
Computers and Operations Research
DUSC: Dimensionality Unbiased Subspace Clustering
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
ACM Transactions on Knowledge Discovery from Data (TKDD)
Subspace and projected clustering: experimental evaluation and analysis
Knowledge and Information Systems
Evaluating clustering in subspace projections of high dimensional data
Proceedings of the VLDB Endowment
Clustering of time series data-a survey
Pattern Recognition
Distance metrics for high dimensional nearest neighborhood recovery: Compression and normalization
Information Sciences: an International Journal
The curse of dimensionality in data mining and time series prediction
IWANN'05 Proceedings of the 8th international conference on Artificial Neural Networks: computational Intelligence and Bioinspired Systems
Partitive clustering (K-means family)
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Survey of clustering algorithms
IEEE Transactions on Neural Networks
A survey on unsupervised outlier detection in high-dimensional numerical data
Statistical Analysis and Data Mining
Hi-index | 0.00 |
High-dimensional data, i.e., data described by a large number of attributes, pose specific challenges to clustering. The so-called ‘curse of dimensionality’, coined originally to describe the general increase in complexity of various computational problems as dimensionality increases, is known to render traditional clustering algorithms ineffective. The curse of dimensionality, among other effects, means that with increasing number of dimensions, a loss of meaningful differentiation between similar and dissimilar objects is observed. As high-dimensional objects appear almost alike, new approaches for clustering are required. Consequently, recent research has focused on developing techniques and clustering algorithms specifically for high-dimensional data. Still, open research issues remain. Clustering is a data mining task devoted to the automatic grouping of data based on mutual similarity. Each cluster groups objects that are similar to one another, whereas dissimilar objects are assigned to different clusters, possibly separating out noise. In this manner, clusters describe the data structure in an unsupervised manner, i.e., without the need for class labels. A number of clustering paradigms exist that provide different cluster models and different algorithmic approaches for cluster detection. Common to all approaches is the fact that they require some underlying assessment of similarity between data objects. In this article, we provide an overview of the effects of high-dimensional spaces, and their implications for different clustering paradigms. We review models and algorithms that address clustering in high dimensions, with pointers to the literature, and sketch open research issues. We conclude with a summary of the state of the art. © 2012 Wiley Periodicals, Inc. © 2012 Wiley Periodicals, Inc.