Continuous Trend-Based Clustering in Data Streams

  • Authors:
  • Maria Kontaki;Apostolos N. Papadopoulos;Yannis Manolopoulos

  • Affiliations:
  • Department of Informatics, Aristotle University, Thessaloniki, Greece 54124;Department of Informatics, Aristotle University, Thessaloniki, Greece 54124;Department of Informatics, Aristotle University, Thessaloniki, Greece 54124

  • Venue:
  • DaWaK '08 Proceedings of the 10th international conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 2008

Abstract

Trend analysis of time series is an important problem, since trend identification enables the prediction of the near future. In streaming time series the problem is more challenging due to the dynamic nature of the data. In this paper, we propose a method for continuously clustering a number of streaming time series based on their trend characteristics. Each streaming time series is transformed into a vector by means of the Piecewise Linear Approximation (PLA) technique. The PLA vector comprises pairs of values (timestamp, trend), denoting the starting time of the trend and the type of the trend (either UP or DOWN), respectively. A distance metric for PLA vectors is introduced. We propose split and merge criteria to continuously update the clustering information. Moreover, the proposed method handles outliers. Performance evaluation results, based on real-life and synthetic data sets, show the efficiency and scalability of the proposed scheme.
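
To make the (timestamp, trend) representation concrete, the following minimal Python sketch stores a trend vector as a sorted list of such pairs and compares two vectors. The abstract does not define the paper's distance metric, so the `disagreement` function here is an illustrative assumption (the fraction of a time window during which two series have different trends), not the authors' metric. The names `TrendVector`, `trend_at`, and `disagreement`, and the encoding of UP and DOWN as 1 and -1, are likewise hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

UP, DOWN = 1, -1  # assumed encoding of the two trend types

# A trend vector: (start timestamp, trend) pairs sorted by timestamp.
TrendVector = List[Tuple[int, int]]


def trend_at(vec: TrendVector, t: int) -> int:
    """Return the trend in effect at time t (0 if t precedes the first pair)."""
    i = bisect_right([ts for ts, _ in vec], t) - 1
    return vec[i][1] if i >= 0 else 0


def disagreement(a: TrendVector, b: TrendVector, start: int, end: int) -> float:
    """Fraction of [start, end) during which the two trends differ.

    Illustrative stand-in only; NOT the distance metric from the paper.
    """
    # Trends change only at stored timestamps, so it is enough to test the
    # left endpoint of each sub-interval between consecutive breakpoints.
    cuts = sorted({start, end, *(ts for ts, _ in a + b if start < ts < end)})
    differing = sum(
        hi - lo
        for lo, hi in zip(cuts, cuts[1:])
        if trend_at(a, lo) != trend_at(b, lo)
    )
    return differing / (end - start)


# Two hypothetical series whose DOWN trend starts at different times.
s1 = [(0, UP), (40, DOWN), (70, UP)]
s2 = [(0, UP), (50, DOWN), (70, UP)]
d = disagreement(s1, s2, 0, 100)
```

In this example the two series disagree only between timestamps 40 and 50, so `d` is 10 time units out of 100. A clustering scheme could use such a measure to decide whether two streams belong in the same cluster, but the split and merge criteria themselves are not described in the abstract and are not sketched here.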