BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
CURE: an efficient clustering algorithm for large databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Automatic subspace clustering of high dimensional data for data mining applications
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Entropy-based subspace clustering for mining numerical data
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Maintaining stream statistics over sliding windows: (extended abstract)
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Streaming-Data Algorithms for High-Quality Clustering
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Pattern Classification (2nd Edition)
Pattern Classification (2nd Edition)
Statistical grid-based clustering over data streams
ACM SIGMOD Record
ACM SIGMOD Record
A Generic Framework for Efficient Subspace Clustering of High-Dimensional Data
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Cell trees: An adaptive synopsis structure for clustering multi-dimensional on-line data streams
Data & Knowledge Engineering
A framework for clustering evolving data streams
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Memory efficient subspace clustering for online data streams
IDEAS '08 Proceedings of the 2008 international symposium on Database engineering & applications
Clustering data stream: A survey of algorithms
International Journal of Knowledge-based and Intelligent Engineering Systems
Efficient mining of skyline objects in subspaces over data streams
Knowledge and Information Systems
MG-join: detecting phenomena and their correlation in high dimensional data streams
Distributed and Parallel Databases
Concurrent semi-supervised learning of data streams
DaWaK'11 Proceedings of the 13th international conference on Data warehousing and knowledge discovery
Density-Based projected clustering of data streams
SUM'12 Proceedings of the 6th international conference on Scalable Uncertainty Management
Hi-index | 0.00 |
A real-life data stream usually contains many dimensions and some dimensional values of its data elements may be missing. In order to effectively extract the on-going change of a data stream with respect to all the subsets of the dimensions of the data stream, a grid-based subspace clustering algorithm is proposed in this paper. Given an n-dimensional data stream, the on-going distribution statistics of data elements in each one-dimension data space is firstly monitored by a list of grid-cells called a sibling list. Once a dense grid-cell of a first-level sibling list becomes a dense unit grid-cell, new second-level sibling lists are created as its child nodes in order to trace any cluster in all possible two-dimensional rectangular subspaces. In such a way, a sibling tree grows up to the nth level at most and a k-dimensional subcluster can be found in the kth level of the sibling tree. The proposed method is comparatively analyzed by a series of experiments to identify its various characteristics.