An EM-Based Algorithm for Clustering Data Streams in Sliding Windows

  • Authors:
  • Xuan Hong Dang;Vincent Lee;Wee Keong Ng;Arridhana Ciptadi;Kok Leong Ong

  • Affiliations:
  • Monash University, Australia;Monash University, Australia;Nanyang Technological University, Singapore;Nanyang Technological University, Singapore;Deakin University, Australia

  • Venue:
  • DASFAA '09 Proceedings of the 14th International Conference on Database Systems for Advanced Applications
  • Year:
  • 2009

Abstract

Cluster analysis has played a key role in data understanding. When this important data mining task is extended to the context of data streams, it becomes more challenging because the data arrive at the mining system and must be processed in a single pass. The problem is even more difficult under the sliding window model, in which the elimination of outdated data must be handled properly. We propose the SWEM algorithm, which exploits the Expectation Maximization (EM) technique to address these challenges. SWEM is not only able to process the stream incrementally, but is also capable of adapting to changes in the underlying stream distribution.
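The abstract does not describe SWEM's internals, so the following is not the authors' algorithm. It is a minimal sketch of the general idea of combining EM with a sliding window, under these assumptions: the data are one-dimensional, the mixture is Gaussian, the stream arrives in batches, and the window holds a fixed number of batches. The class name `SlidingWindowGMM` and its parameters are hypothetical. Each batch is visited once and reduced to per-component sufficient statistics (E-step); the parameters are then re-estimated from the summaries still inside the window (M-step), so outdated data drop out when their summary is evicted.

```python
import math
from collections import deque


class SlidingWindowGMM:
    """Illustrative 1-D Gaussian mixture fitted over a sliding window of batches."""

    def __init__(self, means, window_batches, var=1.0):
        k = len(means)
        self.means = list(means)
        self.vars = [var] * k
        self.weights = [1.0 / k] * k
        # A bounded deque evicts the oldest batch summary automatically.
        self.window = deque(maxlen=window_batches)

    def _responsibilities(self, x):
        # Posterior probability of each component for x, computed in log
        # space so that points far from every component do not underflow.
        logs = [
            math.log(max(w, 1e-300))
            - 0.5 * math.log(2 * math.pi * v)
            - 0.5 * (x - m) ** 2 / v
            for w, m, v in zip(self.weights, self.means, self.vars)
        ]
        top = max(logs)
        exps = [math.exp(value - top) for value in logs]
        total = sum(exps)
        return [e / total for e in exps]

    def update(self, batch):
        k = len(self.means)
        n = [0.0] * k  # soft counts
        s = [0.0] * k  # weighted sums of x
        q = [0.0] * k  # weighted sums of x squared
        # E-step: one pass over the batch, keeping only its summary.
        for x in batch:
            r = self._responsibilities(x)
            for j in range(k):
                n[j] += r[j]
                s[j] += r[j] * x
                q[j] += r[j] * x * x
        self.window.append((n, s, q))

        # M-step: re-estimate parameters from summaries inside the window.
        total = sum(sum(b[0]) for b in self.window)
        if total == 0:
            return  # only empty batches so far
        for j in range(k):
            nj = sum(b[0][j] for b in self.window)
            self.weights[j] = nj / total
            if nj < 1e-9:
                continue  # component has no support; keep its old shape
            sj = sum(b[1][j] for b in self.window)
            qj = sum(b[2][j] for b in self.window)
            self.means[j] = sj / nj
            # Floor the variance to avoid a degenerate component.
            self.vars[j] = max(qj / nj - self.means[j] ** 2, 1e-6)
```

Calling `update` once per arriving batch keeps memory proportional to the window length rather than to the stream length. A simplification of this sketch is that summaries already in the window are not recomputed when the parameters change, so the estimate lags a drift by up to one window.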