A framework for flexible clustering of multiple evolving data streams

Authors:
Wei Fan;Toyohide Watanabe;Koichi Asakura
Affiliations:
Department of Systems and Social Informatics, Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan.;Department of Systems and Social Informatics, Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan.;School of Informatics, Daido Institute of Technology, 10-3, Takiharu-cho, Minami-ku, Nagoya 457-8530, Japan
Venue:
International Journal of Advanced Intelligence Paradigms
Year:
2008

Abstract

In this paper, we propose a framework that supports clustering over different portions of continuous data streams at all possible time points. The framework is divided into two phases. The online statistics maintenance phase provides an approximation method for collecting statistics online and a compact multi-resolution hierarchy for maintaining them. Once a clustering request is submitted, the offline clustering phase extracts from the statistics hierarchies the statistics that approximate the user-desired subsequences as precisely as possible, and outputs the results of clustering over these statistics. Our performance experiments over real and synthetic data sets illustrate the effectiveness and efficiency of our approach.
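
The abstract does not say how the multi-resolution hierarchy is built, so the Python sketch below is only one plausible illustration of the idea, not the authors' method. It assumes that each stream is summarized per fixed-size window by a count, a sum, and a sum of squares, and that older summaries are merged pairwise into coarser levels. All names (Summary, StatsHierarchy, window, per_level, query) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Additive statistics for the stream positions [start, end)."""
    start: int
    end: int
    n: int
    total: float
    sq_total: float

    def merge(self, other):
        # Statistics are additive, so two adjacent summaries combine exactly.
        return Summary(self.start, other.end, self.n + other.n,
                       self.total + other.total,
                       self.sq_total + other.sq_total)


class StatsHierarchy:
    """Assumed layout: level 0 is finest; each higher level is coarser."""

    def __init__(self, window=4, per_level=2):
        self.window = window        # points per level-0 summary (assumption)
        self.per_level = per_level  # summaries kept per level before merging
        self.levels = [[]]
        self._buf = []
        self._t = 0

    def add(self, x):
        """Online phase: consume one point and update the hierarchy."""
        self._buf.append(float(x))
        self._t += 1
        if len(self._buf) == self.window:
            s = Summary(self._t - self.window, self._t, self.window,
                        sum(self._buf), sum(v * v for v in self._buf))
            self._buf.clear()
            self._push(0, s)

    def _push(self, level, s):
        if level == len(self.levels):
            self.levels.append([])
        self.levels[level].append(s)
        if len(self.levels[level]) > self.per_level:
            # Merge the two oldest summaries and promote them one level up.
            a = self.levels[level].pop(0)
            b = self.levels[level].pop(0)
            self._push(level + 1, a.merge(b))

    def query(self, start, end):
        """Offline phase: summaries overlapping [start, end), oldest first."""
        hits = [s for lvl in self.levels for s in lvl
                if s.start < end and s.end > start]
        return sorted(hits, key=lambda s: s.start)


h = StatsHierarchy(window=4, per_level=2)
for i in range(32):
    h.add(i)
approx = h.query(20, 32)
```

With these settings, 32 points are held in four summaries covering 16, 8, 4, and 4 points from oldest to newest. The request for positions 20 to 32 is answered by the summaries starting at 16, 24, and 28, so the oldest part of the request is only approximated by a coarser summary. A clustering step could then run over such per-stream statistics instead of the raw points.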