Evolutionary clustering

  • Authors:
  • Deepayan Chakrabarti;Ravi Kumar;Andrew Tomkins

  • Affiliations:
  • Yahoo! Research, Sunnyvale, CA;Yahoo! Research, Sunnyvale, CA;Yahoo! Research, Sunnyvale, CA

  • Venue:
  • Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

We consider the problem of clustering data over time. An evolutionary clustering should simultaneously optimize two potentially conflicting criteria: first, the clustering at any point in time should remain faithful to the current data as much as possible; and second, the clustering should not shift dramatically from one timestep to the next. We present a generic framework for this problem, and discuss evolutionary versions of two widely-used clustering algorithms within this framework: k-means and agglomerative hierarchical clustering. We extensively evaluate these algorithms on real data sets and show that our algorithms can simultaneously attain both high accuracy in capturing today's data, and high fidelity in reflecting yesterday's clustering.