For the last 10 years, the main paradigm for clustering evolving data streams has been to divide the clustering process into two phases: an online phase that computes and stores detailed statistics about the data in micro-clusters, and an offline phase that queries the micro-cluster statistics and returns the desired clustering structures. The argument for two-phase algorithms is that they support evolving data streams and temporal multi-scale analysis, which single-pass algorithms do not. In this paper, we describe ClusTrel, a single-pass, fully online, trellis-based algorithm designed for centroid-based clustering that supports evolving data streams and generates clustering structures immediately after each new point is processed. We assess the performance of ClusTrel and compare it with state-of-the-art algorithms for clustering data streams; it shows similar performance with a smaller memory footprint.
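To make the two-phase paradigm mentioned in the abstract concrete, here is a minimal sketch of an online phase that maintains micro-cluster statistics and an offline phase that queries them. It is not ClusTrel, whose trellis-based procedure the abstract does not detail, and it is not any specific published algorithm. The names `MicroCluster`, `OnlineSummarizer`, and `offline_centroids`, the fixed `radius` threshold, and the choice of summary statistics (count, linear sum, squared sum) are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    """Additive summary of absorbed points (hypothetical layout):
    count, per-dimension linear sum, per-dimension squared sum."""
    n: int
    linear_sum: list
    squared_sum: list

    @classmethod
    def from_point(cls, point):
        return cls(1, list(point), [x * x for x in point])

    def absorb(self, point):
        # Statistics are additive, so a point is folded in with O(d) work.
        self.n += 1
        for i, x in enumerate(point):
            self.linear_sum[i] += x
            self.squared_sum[i] += x * x

    def centroid(self):
        return [s / self.n for s in self.linear_sum]


def distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class OnlineSummarizer:
    """Online phase: one pass over the stream, storing only micro-clusters."""

    def __init__(self, radius):
        self.radius = radius  # assumed absorption threshold, not from the paper
        self.micro_clusters = []

    def process(self, point):
        nearest = min(
            self.micro_clusters,
            key=lambda mc: distance(mc.centroid(), point),
            default=None,
        )
        if nearest is not None and distance(nearest.centroid(), point) <= self.radius:
            nearest.absorb(point)
        else:
            # No micro-cluster is close enough: start a new one.
            self.micro_clusters.append(MicroCluster.from_point(point))


def offline_centroids(micro_clusters):
    """Offline phase: query the stored statistics on demand.
    Returns (centroid, point_count) pairs; a real system would typically
    run a further clustering step over these summaries."""
    return [(mc.centroid(), mc.n) for mc in micro_clusters]


if __name__ == "__main__":
    summarizer = OnlineSummarizer(radius=1.0)
    for p in [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]:
        summarizer.process(p)
    for centroid, count in offline_centroids(summarizer.micro_clusters):
        print(centroid, count)
```

In this sketch the clustering structure is only produced when `offline_centroids` is called, which is the separation a fully online algorithm such as ClusTrel is described as avoiding.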