Automatic subspace clustering of high dimensional data for data mining applications
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Data mining: concepts and techniques
Data mining: concepts and techniques
WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Optimal Grid-Clustering: Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Streaming-Data Algorithms for High-Quality Clustering
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Statistical grid-based clustering over data streams
ACM SIGMOD Record
Subspace Selection for Clustering High-Dimensional Data
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
A framework for clustering evolving data streams
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
A framework for projected clustering of high dimensional data streams
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
A grid-based clustering algorithm for high-dimensional data streams
ADMA'05 Proceedings of the First international conference on Advanced Data Mining and Applications
Hi-index | 0.00 |
Many applications require the clustering of high-dimensional data streams. We propose a subspace clustering algorithm that can find clusters in different subspaces through one pass over a data stream. The algorithm combines the bottom-up grid-based method and top-down grid-based method. A uniformly partitioned grid data structure is used to summarize the data stream online. The top-down grid partition method is used o find the subspaces in which clusters locate. The errors made by the top-down partition procedure are eliminated by a mergence step in our algorithm. Our performance study with real datasets and synthetic dataset demonstrates the efficiency and effectiveness of our proposed algorithm.