Online and offline trend cluster discovery in spatially distributed data streams

  • Authors:
  • Anna Ciampi;Annalisa Appice;Donato Malerba

  • Affiliations:
  • Dipartimento di Informatica, Università degli Studi di Bari Aldo Moro, Bari - Italy;Dipartimento di Informatica, Università degli Studi di Bari Aldo Moro, Bari - Italy;Dipartimento di Informatica, Università degli Studi di Bari Aldo Moro, Bari - Italy

  • Venue:
  • MSM'10/MUSE'10 Proceedings of the 2010 international conference on Analysis of social media and ubiquitous data
  • Year:
  • 2010

Abstract

Emerging real-life applications, such as environmental compliance, ecological studies and meteorology, are characterized by real-time data acquisition through remote sensor networks. The most important aspect of the sensor readings is that they comprise a space dimension and a time dimension, both of which are information bearing. Additionally, they usually arrive at a rapid rate in a continuous, unbounded stream. Streaming prevents us from storing all readings and performing multiple scans of the entire data set. The drift of the data distribution poses the additional problem of mining patterns which may change over time. We address these challenges for trend cluster discovery, that is, the discovery of clusters of spatially close sensors which transmit readings whose temporal variation, called the trend polyline, is similar along the time horizon of a window. We present a stream framework which segments the stream into equally-sized windows, computes intra-window trend clusters online and stores these trend clusters in a database. Trend clusters can be queried offline at any time to determine trend clusters along larger windows (i.e. windows of windows). Experiments with several streams demonstrate the effectiveness of the proposed framework in discovering trend clusters that are accurate and relevant to humans.
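
The online step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that each snapshot is a dictionary from sensor id to reading, that "spatially close" means Euclidean distance within a hypothetical `radius`, that "similar" means readings differing by at most a hypothetical `delta` at every time point of the window, and that the trend polyline is the per-time-point mean of a cluster's readings. The names `windows`, `trend_clusters`, `positions`, `radius` and `delta` are all illustrative, and the grouping is a plain connected-components pass.

```python
from itertools import combinations
from math import dist
from statistics import mean


def windows(stream, size):
    """Yield consecutive, non-overlapping windows of `size` snapshots."""
    buffer = []
    for snapshot in stream:
        buffer.append(snapshot)
        if len(buffer) == size:
            yield buffer
            buffer = []


def trend_clusters(window, positions, radius, delta):
    """Return (members, polyline) pairs for one window.

    Assumption: two sensors are linked when they lie within `radius` of each
    other and their readings differ by at most `delta` at every time point.
    Clusters are the connected components of that link graph.
    """
    # Only sensors that report in every snapshot of the window are considered.
    sensors = sorted(set.intersection(*(set(s) for s in window)))
    parent = {s: s for s in sensors}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(sensors, 2):
        close = dist(positions[a], positions[b]) <= radius
        similar = all(abs(snap[a] - snap[b]) <= delta for snap in window)
        if close and similar:
            parent[find(a)] = find(b)

    groups = {}
    for s in sensors:
        groups.setdefault(find(s), []).append(s)

    # Assumed polyline: mean reading of the cluster at each time point.
    return [
        (members, [mean(snap[m] for m in members) for snap in window])
        for members in groups.values()
    ]


# Toy stream: A and B are neighbours with similar readings, C is far away.
positions = {"A": (0, 0), "B": (0, 1), "C": (5, 5)}
stream = [
    {"A": 10.0, "B": 10.5, "C": 20.0},
    {"A": 11.0, "B": 11.5, "C": 19.0},
    {"A": 12.0, "B": 12.5, "C": 18.0},
    {"A": 13.0, "B": 13.5, "C": 17.0},
]
# One list of trend clusters per window, standing in for the database.
store = [trend_clusters(w, positions, radius=1.5, delta=1.0)
         for w in windows(stream, 2)]
```

In this toy run `store` holds one entry per window of two snapshots, each listing the cluster members together with their polyline. The offline stage mentioned in the abstract (trend clusters over windows of windows) would operate on these stored summaries rather than on the raw readings; it is not sketched here.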