Approximate trace of grid-based clusters over high dimensional data streams

Authors:
Nam Hun Park; Won Suk Lee
Affiliations:
Department of Computer Science, Yonsei University, Seoul, Korea; Department of Computer Science, Yonsei University, Seoul, Korea
Venue:
PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
Year:
2007

Abstract

Clustering a large, high-dimensional data set has long been a serious challenge in data mining. A good clustering method should scale with both the number of dimensions and the size of the data set. We previously proposed a grid-based clustering method, called the hybrid-partition method, for on-line data streams. However, as the dimensionality of a data stream increases, the time and space complexity of that method grows rapidly. In this paper, a sibling list is proposed to find the clusters of a multi-dimensional data space based on the one-dimensional clusters of each dimension. Although the multi-dimensional clusters identified this way may be less accurate, the one-dimensional approach scales better with the number of dimensions because it requires much less memory than the multi-dimensional approach. Therefore, the confined space of main memory can be used more effectively by the one-dimensional approach.
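
The abstract does not describe how the sibling list works, so the following is not the authors' data structure. It is only a minimal Python sketch of the general idea of deriving approximate multi-dimensional clusters from one-dimensional clusters, under several assumptions: fixed-width cells on a known value range, a relative support threshold for dense cells, and hypothetical names (`OneDimGrid`, `clusters_1d`, `label`) chosen purely for illustration. It keeps one counter per dimension, so memory grows with `dims * bins` rather than `bins ** dims`.

```python
from collections import Counter


class OneDimGrid:
    """Hypothetical sketch: per-dimension cell counts over a stream."""

    def __init__(self, dims, bins, lo=0.0, hi=1.0):
        # Assumption: every dimension shares the same known range [lo, hi).
        self.dims, self.bins, self.lo, self.hi = dims, bins, lo, hi
        self.counts = [Counter() for _ in range(dims)]  # one 1-D grid per dimension
        self.n = 0  # number of points seen so far

    def cell(self, x):
        # Map a value to a cell index, clamping to the valid range.
        i = int((x - self.lo) / (self.hi - self.lo) * self.bins)
        return min(max(i, 0), self.bins - 1)

    def add(self, point):
        # Update each dimension's 1-D grid independently.
        self.n += 1
        for d, x in enumerate(point):
            self.counts[d][self.cell(x)] += 1

    def clusters_1d(self, d, min_support):
        # A 1-D cluster is a maximal run of adjacent dense cells,
        # returned as a (first_cell, last_cell) pair.
        dense = sorted(c for c, k in self.counts[d].items()
                       if k >= min_support * self.n)
        runs = []
        for c in dense:
            if runs and c == runs[-1][1] + 1:
                runs[-1] = (runs[-1][0], c)  # extend the current run
            else:
                runs.append((c, c))          # start a new run
        return runs

    def label(self, point, min_support):
        # Approximate multi-dimensional label: the tuple of 1-D cluster
        # indices, or None if any coordinate lies outside every 1-D cluster.
        result = []
        for d, x in enumerate(point):
            c = self.cell(x)
            for idx, (a, b) in enumerate(self.clusters_1d(d, min_support)):
                if a <= c <= b:
                    result.append(idx)
                    break
            else:
                return None
        return tuple(result)
```

In this sketch, a point is labelled whenever each of its coordinates falls inside some one-dimensional cluster, even if no stream point was ever observed in that combination of intervals. That is one way such a projection-based scheme can trade accuracy for memory; how the paper's sibling list handles this case is not stated in the abstract.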