In this paper, we propose a framework that supports clustering over different portions of continuous data streams at all possible time points. The framework is divided into two phases. The online statistics maintenance phase provides an approximation method for online statistics collection and a compact multi-resolution hierarchy for statistics maintenance. Once a clustering request is submitted, the offline clustering phase extracts from the statistics hierarchies the statistics that approximate the user-desired subsequences as precisely as possible, and outputs the results of clustering over these statistics. Our performance experiments over real and synthetic data sets illustrate the effectiveness and efficiency of our approach.
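The two-phase idea can be illustrated with a minimal sketch. The abstract does not specify the approximation method or the layout of the hierarchy, so everything below is an assumption made for illustration: the names `Summary`, `StatsHierarchy`, and `per_level` are hypothetical, the stream is one-dimensional, and the merging rule (combine the two oldest summaries of a level into one coarser summary) is a generic bucket-merging scheme rather than the paper's own method. The online part keeps count, sum, and sum of squares per time span; the offline part returns the summaries that overlap a requested time range, which a clustering routine could then consume in place of the raw points.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Additive statistics for a contiguous span of the stream (hypothetical)."""
    start: int   # first timestamp covered (inclusive)
    end: int     # last timestamp covered (inclusive)
    n: int       # number of points
    s: float     # sum of values
    ss: float    # sum of squared values

    def merge(self, other):
        # Statistics are additive, so merging two spans is exact.
        return Summary(min(self.start, other.start), max(self.end, other.end),
                       self.n + other.n, self.s + other.s, self.ss + other.ss)

    @property
    def mean(self):
        return self.s / self.n


class StatsHierarchy:
    """Assumed multi-resolution layout: recent data stays fine-grained,
    older data is held in progressively coarser summaries."""

    def __init__(self, per_level=4):
        self.per_level = per_level   # max summaries kept at each level
        self.levels = [[]]           # levels[i] is ordered oldest first
        self.t = 0                   # next timestamp

    def append(self, x):
        # Online phase: one constant-size summary per arriving point.
        self._push(0, Summary(self.t, self.t, 1, x, x * x))
        self.t += 1

    def _push(self, level, summary):
        if level == len(self.levels):
            self.levels.append([])
        bucket = self.levels[level]
        bucket.append(summary)
        if len(bucket) > self.per_level:
            # Overflow: merge the two oldest summaries into the coarser level.
            oldest, second = bucket.pop(0), bucket.pop(0)
            self._push(level + 1, oldest.merge(second))

    def query(self, start, end):
        # Offline phase: summaries overlapping [start, end], oldest first.
        # Coarse summaries may extend beyond the range, so the answer is
        # approximate for older portions of the stream.
        hits = [s for lvl in self.levels for s in lvl
                if s.end >= start and s.start <= end]
        return sorted(hits, key=lambda s: s.start)


h = StatsHierarchy(per_level=2)
for value in range(16):
    h.append(float(value))
recent = h.query(14, 15)   # fine-grained summaries for the newest points
old = h.query(0, 3)        # a single coarse summary covering older data
```

In this sketch a request over recent timestamps is answered from fine-grained summaries, while a request over old timestamps is answered from a coarser summary that may cover more than the requested span. That trade-off between memory and precision is the kind a multi-resolution hierarchy is meant to manage; the clustering step itself (for example, a weighted clustering over each summary's `mean` and `n`) is left out here.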