Data stream clustering is an important task in data stream mining. In this paper, we propose SDStream, a new method for density-based clustering of data streams over sliding windows. SDStream adopts the CluStream clustering framework. In the online component, potential core-micro-cluster and outlier micro-cluster structures are introduced to maintain the potential clusters and outliers. They are stored in main memory as Exponential Histograms of Cluster Feature (EHCFs) and are updated through EHCF maintenance. Outdated micro-clusters that need to be deleted are identified by the value of t in the Temporal Cluster Feature (TCF). In the offline component, the DBSCAN algorithm generates the final clusters of arbitrary shape from all the potential core-micro-clusters maintained online. Experimental results show that SDStream, which can generate clusters of arbitrary shape, achieves much higher clustering quality than CluStream, which generates spherical clusters.
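The online bookkeeping described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The field layout of `TCF` (count, linear sum, squared sum, latest timestamp t), the radius formula, and the `MicroCluster` class are assumptions made for illustration. In particular, `MicroCluster` keeps one TCF per point and omits the exponential-histogram merging rule that a real EHCF would use to bound memory.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class TCF:
    """Hypothetical temporal cluster feature (field layout is an assumption)."""
    n: int      # number of points summarized
    ls: list    # per-dimension linear sum of the points
    ss: list    # per-dimension sum of squared coordinates
    t: int      # timestamp of the most recent point summarized

    @classmethod
    def from_point(cls, point, t):
        return cls(1, list(point), [x * x for x in point], t)

    def merge(self, other):
        # Cluster features are additive; the merged t is the more recent one.
        return TCF(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            [a + b for a, b in zip(self.ss, other.ss)],
            max(self.t, other.t),
        )

    def center(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        # Root-mean-square deviation from the center (assumed definition).
        var = sum(q / self.n - (s / self.n) ** 2
                  for s, q in zip(self.ls, self.ss))
        return sqrt(max(var, 0.0))


class MicroCluster:
    """Simplified stand-in for an EHCF: a plain list of TCFs, oldest first."""

    def __init__(self):
        self.tcfs = []

    def insert(self, point, t):
        self.tcfs.append(TCF.from_point(point, t))

    def expire(self, now, window):
        # Keep only TCFs whose t lies inside the window (now - window, now].
        self.tcfs = [f for f in self.tcfs if f.t > now - window]

    def summary(self):
        # Merge all live TCFs into one; None if the micro-cluster is empty.
        if not self.tcfs:
            return None
        total = self.tcfs[0]
        for f in self.tcfs[1:]:
            total = total.merge(f)
        return total
```

As a usage example, inserting the points (0, 0), (2, 0), (4, 0) at times 1, 2, 3 and then calling `expire(now=3, window=2)` leaves only the last two points, so `summary()` reports two points centered at (3, 0). In the offline step, the centers of such summaries would be the input to a density-based algorithm such as DBSCAN.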