Combining Multiple Interrelated Streams for Incremental Clustering

  • Authors:
  • Zaigham Faraz Siddiqui; Myra Spiliopoulou

  • Affiliations:
  • Otto-von-Guericke-University of Magdeburg, Magdeburg, Germany 39106; Otto-von-Guericke-University of Magdeburg, Magdeburg, Germany 39106

  • Venue:
  • SSDBM 2009 Proceedings of the 21st International Conference on Scientific and Statistical Database Management
  • Year:
  • 2009

Abstract

Many data mining applications analyze structured data that span many tables and accumulate over time. Incremental mining methods have been devised to adapt patterns to new tuples. However, they have been designed for data in one table only. We propose a method for incremental clustering on multiple interrelated streams - a "multi-table stream": its components are streams that reference each other, arrive at different speeds, and have attributes whose value ranges are not known a priori. Our approach encompasses solutions for the maintenance of caches and sliding windows over the individual streams, the propagation of foreign keys across streams, the transformation of all streams into a single-table stream, and an incremental clustering algorithm that operates over that stream. We evaluate our method on two real datasets and show that it closely approximates the performance of an ideal method that possesses unlimited resources and knows the future.
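
The pipeline outlined in the abstract (a window over a referencing stream, a join on a foreign key into a single flat row, and a clusterer that updates per row) can be illustrated with a minimal sketch. This is not the authors' algorithm: the names `SlidingWindow`, `flatten`, and `OnlineKMeans`, the field names `id`, `age`, `customer_id`, and `amount`, the count/mean aggregation, and the use of sequential k-means are all assumptions chosen only for illustration.

```python
from collections import deque


class SlidingWindow:
    """Keep only the most recent `size` tuples of one stream (hypothetical helper)."""

    def __init__(self, size):
        self.items = deque(maxlen=size)  # oldest tuples fall out automatically

    def add(self, item):
        self.items.append(item)


def flatten(target, window, key="customer_id"):
    """Join one target tuple with the windowed tuples that reference it
    and aggregate them into a single flat row (assumed aggregation: count, mean)."""
    amounts = [t["amount"] for t in window.items if t[key] == target["id"]]
    count = len(amounts)
    mean = sum(amounts) / count if count else 0.0
    return [target["age"], count, mean]


class OnlineKMeans:
    """Sequential k-means: each new row moves its nearest centroid slightly.
    A stand-in for an incremental clusterer, not the method of the paper."""

    def __init__(self, k):
        self.k = k
        self.centroids = []
        self.counts = []

    def update(self, x):
        # The first k rows seed the centroids.
        if len(self.centroids) < self.k:
            self.centroids.append([float(v) for v in x])
            self.counts.append(1)
            return len(self.centroids) - 1
        # Otherwise find the nearest centroid by squared Euclidean distance.
        j = min(
            range(self.k),
            key=lambda i: sum((c - v) ** 2 for c, v in zip(self.centroids[i], x)),
        )
        self.counts[j] += 1
        step = 1.0 / self.counts[j]  # running-mean update
        self.centroids[j] = [c + step * (v - c) for c, v in zip(self.centroids[j], x)]
        return j
```

In use, each arriving tuple of the referencing stream would be added to its window, each target tuple would be passed through `flatten`, and the resulting row would be handed to `OnlineKMeans.update`. A real system would also need the cache maintenance and value-range handling that the abstract mentions, which this sketch omits.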