Mining distributed evolving data streams using fractal GP ensembles

  • Authors:
  • Gianluigi Folino;Clara Pizzuti;Giandomenico Spezzano

  • Affiliations:
  • Institute for High Performance Computing and Networking, CNR, ICAR, Rende, CS, Italy;Institute for High Performance Computing and Networking, CNR, ICAR, Rende, CS, Italy;Institute for High Performance Computing and Networking, CNR, ICAR, Rende, CS, Italy

  • Venue:
  • EuroGP'07 Proceedings of the 10th European conference on Genetic programming
  • Year:
  • 2007

Abstract

A Genetic Programming based boosting ensemble method for the classification of distributed streaming data is proposed. The approach handles flows of data coming from multiple locations by building a global model obtained by aggregating the local models coming from each node. A main characteristic of the presented algorithm is its adaptability in the presence of concept drift. Changes in data can cause serious deterioration of the ensemble performance. Our approach is able to discover changes by adopting a strategy based on self-similarity of the ensemble behavior, measured by its fractal dimension, and to revise itself by promptly restoring classification accuracy. Experimental results on a synthetic data set show the validity of the approach in maintaining an accurate and up-to-date GP ensemble.
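
The abstract does not say how the fractal dimension is computed or how a change is declared, so the following is only a minimal sketch of the general idea, under stated assumptions: the ensemble behavior is taken to be a series of per-block accuracy values, its fractal dimension is estimated by box counting, and drift is flagged when the estimate for a recent window departs from that of a reference window. The names `box_counting_dimension`, `drift_detected`, and the `tolerance` threshold are hypothetical and are not taken from the paper.

```python
import math


def box_counting_dimension(values, scales=(2, 4, 8, 16)):
    """Estimate the box-counting dimension of the curve (t, values[t]).

    Both axes are rescaled to [0, 1]. For each k x k grid the number of
    occupied boxes N(k) is counted, and the least-squares slope of
    log N(k) against log k is returned.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # a constant series would otherwise divide by zero
    points = [(i / (n - 1), (v - lo) / span) for i, v in enumerate(values)]

    log_k, log_n = [], []
    for k in scales:
        # Map each point to its grid cell; clamp the right/top edge into range.
        occupied = {
            (min(int(x * k), k - 1), min(int(y * k), k - 1)) for x, y in points
        }
        log_k.append(math.log(k))
        log_n.append(math.log(len(occupied)))

    mean_x = sum(log_k) / len(log_k)
    mean_y = sum(log_n) / len(log_n)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_k, log_n))
    sxx = sum((x - mean_x) ** 2 for x in log_k)
    return sxy / sxx


def drift_detected(reference, recent, tolerance=0.3):
    """Flag drift when the two dimension estimates differ by more than
    `tolerance` (an assumed, illustrative threshold)."""
    gap = box_counting_dimension(recent) - box_counting_dimension(reference)
    return abs(gap) > tolerance


# Illustrative data only: a flat accuracy series and a rapidly oscillating one.
stable = [0.9] * 256
erratic = [0.5 + 0.4 * (i % 16) / 15 for i in range(256)]
print(box_counting_dimension(stable), box_counting_dimension(erratic))
print(drift_detected(stable, stable), drift_detected(stable, erratic))
```

With these illustrative inputs the flat series gives an estimate of about 1, while the oscillating series fills the grid at the chosen scales and gives an estimate close to 2, so only the second comparison is flagged. In a streaming setting the same check would presumably be repeated on a sliding window, with the ensemble revised whenever it fires.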