The main challenge in data stream clustering is handling continuously arriving data that are unbounded in volume and cannot all be stored. To cope with this storage limit, each data point must be processed in a single pass, once upon arrival, and then discarded. Because previously clustered data no longer exist, they must be captured mathematically in the form of group features. The proposed data stream clustering algorithm is divided into two main phases, on-line and off-line. In the on-line phase, new micro-cluster features are proposed that represent the arriving data better than traditional micro-cluster features. In the off-line phase, the prepared micro-clusters are categorized by their densities. The proposed method can generate final clusters of different shapes and densities. Measured by entropy, purity, the Jaccard coefficient, and the Rand statistic, our algorithm outperforms previous data stream clustering algorithms on both synthetic and real data.
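The single-pass idea above can be illustrated with the traditional micro-cluster summary, which keeps only a point count, a per-dimension linear sum, and a per-dimension squared sum. The sketch below is a minimal illustration of that conventional feature vector, not of the new features this paper proposes, which the abstract does not specify. The names `MicroCluster`, `insert`, `centroid`, `radius`, and `summarize` are hypothetical and chosen only for this example.

```python
import math
from dataclasses import dataclass, field
from typing import Iterable, List, Sequence


@dataclass
class MicroCluster:
    """Traditional micro-cluster summary (count, linear sum, squared sum).

    Hypothetical illustration; not the paper's proposed feature set.
    """
    dim: int
    n: int = 0
    ls: List[float] = field(default_factory=list)  # per-dimension sum of x
    ss: List[float] = field(default_factory=list)  # per-dimension sum of x^2

    def __post_init__(self) -> None:
        if not self.ls:
            self.ls = [0.0] * self.dim
        if not self.ss:
            self.ss = [0.0] * self.dim

    def insert(self, point: Sequence[float]) -> None:
        # Absorb one point in O(dim) time; the point itself is not stored.
        self.n += 1
        for i, x in enumerate(point):
            self.ls[i] += x
            self.ss[i] += x * x

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.ls]

    def radius(self) -> float:
        # Root-mean-square distance of the absorbed points from the centroid,
        # recovered from the summary alone.
        var = sum(
            self.ss[i] / self.n - (self.ls[i] / self.n) ** 2
            for i in range(self.dim)
        )
        return math.sqrt(max(var, 0.0))


def summarize(stream: Iterable[Sequence[float]], dim: int) -> MicroCluster:
    """Consume a stream once, keeping only the summary statistics."""
    mc = MicroCluster(dim)
    for point in stream:
        mc.insert(point)  # after this call the raw point can be discarded
    return mc
```

Because the summary is additive, the centroid and spread of all absorbed points remain available after the raw data are gone, which is what allows an off-line phase to work on micro-clusters rather than on the original stream.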