A single pass algorithm for clustering evolving data streams based on swarm intelligence

  • Authors:
  • Agostino Forestiero;Clara Pizzuti;Giandomenico Spezzano

  • Affiliations:
  • National Research Council of Italy (CNR), Rende (CS), Italy 87036;National Research Council of Italy (CNR), Rende (CS), Italy 87036;National Research Council of Italy (CNR), Rende (CS), Italy 87036

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 2013

Abstract

Existing density-based data stream clustering algorithms use a two-phase scheme consisting of an online phase, in which raw data are processed to gather summary statistics, and an offline phase that generates the clusters from the summary data. In this article we propose a data stream clustering method based on a multi-agent system that uses a decentralized, bottom-up, self-organizing strategy to group similar data points. Data points are associated with agents and deployed onto a 2D space, where the agents work simultaneously by applying a heuristic strategy based on a bio-inspired model known as the flocking model. Agents move through the space for a fixed time and, when they encounter other agents within a predefined visibility range, they can decide to form a flock if they are similar. Flocks can join to form swarms of similar groups. This strategy makes it possible to merge the two phases of density-based approaches and thus to avoid the computationally demanding offline cluster computation, since a swarm represents a cluster. Experimental results show that the bio-inspired approach can obtain very good results on real and synthetic data sets.
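
The grouping rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it shows only one static snapshot, omits agent movement and stream handling, and merges agents with a plain union-find. The function name `group_agents`, the parameters `visibility` and `similarity`, and the use of Euclidean distance for both tests are assumptions made for illustration.

```python
import math


def group_agents(points, positions, visibility=1.5, similarity=1.0):
    """Group agents that see each other in 2D space and carry similar data.

    points:    data point carried by each agent (hypothetical feature tuples)
    positions: current 2D location of each agent
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            near = math.dist(positions[i], positions[j]) <= visibility
            alike = math.dist(points[i], points[j]) <= similarity
            if near and alike:
                # Agents within visibility range that are similar join one group.
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Four agents: 0 and 1 carry similar data and are close in the 2D space;
# 2 and 3 carry similar data but are too far apart to see each other.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
print(group_agents(points, positions))
```

In this snapshot, agents 2 and 3 stay separate even though their data are similar, because they are outside each other's visibility range. In the approach described above, agents keep moving for a fixed time, so such agents could still meet and merge in a later step.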