Clustering large data sets of high dimensionality has always been a challenge for clustering algorithms. Many recently developed clustering algorithms have attempted to handle data sets with a very large number of records, a very high number of dimensions, or both. This paper provides a discussion of the advantages and limitations of existing algorithms when they operate on very large multidimensional data sets. To simultaneously overcome both the "curse of dimensionality" and the scalability problems associated with large amounts of data, we propose a new clustering algorithm called O-Cluster. O-Cluster combines a novel active sampling technique with an axis-parallel partitioning strategy to identify continuous areas of high density in the input space. The method operates on a limited memory buffer and requires at most a single scan through the data. We demonstrate the high quality of the obtained clustering solutions, their robustness to noise, and O-Cluster's excellent scalability.
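To make the idea of axis-parallel partitioning concrete, the following is a minimal Python sketch of the general technique only. It is not the O-Cluster algorithm: the abstract does not say how cutting planes are chosen, so the histogram-valley rule used here is an assumption, and the active sampling and limited memory buffer are omitted entirely. The names `best_valley_cut` and `partition`, and the parameters `bins`, `min_ratio`, and `min_size`, are hypothetical.

```python
def histogram(values, lo, hi, bins):
    """Count values into equal-width bins over [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    return counts, width


def best_valley_cut(values, bins=10, min_ratio=0.5):
    """Return a cut point inside a low-density valley between two peaks, or None.

    Assumed rule (not from the paper): a bin is a valley if its count is below
    min_ratio times the smaller of the highest peaks on either side of it.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return None
    counts, width = histogram(values, lo, hi, bins)
    best = None
    for i in range(1, bins - 1):
        peak = min(max(counts[:i]), max(counts[i + 1:]))
        if peak > 0 and counts[i] < min_ratio * peak:
            depth = peak - counts[i]
            if best is None or depth > best[0]:
                best = (depth, lo + (i + 0.5) * width)  # cut at the bin centre
    return None if best is None else best[1]


def partition(points, bins=10, min_size=20):
    """Recursively split points with cuts perpendicular to a single axis.

    Returns a list of leaf partitions; each leaf is treated as one dense region.
    """
    if len(points) < min_size:
        return [points]
    for dim in range(len(points[0])):
        cut = best_valley_cut([p[dim] for p in points], bins)
        if cut is not None:
            left = [p for p in points if p[dim] <= cut]
            right = [p for p in points if p[dim] > cut]
            if left and right:
                return (partition(left, bins, min_size)
                        + partition(right, bins, min_size))
    return [points]  # no dimension offers a valley: keep as one region


if __name__ == "__main__":
    # Two well-separated 1-D groups, generated deterministically.
    data = [(i * 0.01,) for i in range(100)] + [(10 + i * 0.01,) for i in range(100)]
    for leaf in partition(data):
        print(len(leaf), min(leaf), max(leaf))
```

Because every cut is perpendicular to one axis, each split needs only a one-dimensional histogram, which is part of why this family of methods is attractive in high dimensions. On noisy data the simple ratio rule above would likely produce spurious cuts in sparse tails, so a practical method would need a more careful significance criterion.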