An efficient itemset mining approach for data streams

  • Authors:
  • Elena Baralis;Tania Cerquitelli;Silvia Chiusano;Alberto Grand;Luigi Grimaudo

  • Affiliations:
  • Politecnico di Torino, Dipartimento di Automatica e Informatica, Torino, Italy (all authors)

  • Venue:
  • KES'11 Proceedings of the 15th international conference on Knowledge-based and intelligent information and engineering systems - Volume Part II
  • Year:
  • 2011

Abstract

This paper presents a new approach for efficiently discovering correlations among data items in a sequence of incoming data windows. The approach supports both on-line queries (e.g., mining only the most recent data) and off-line queries (e.g., analyzing aggregate data windows), as well as user-defined item and support constraints. Given a sequence of transactional data windows and a minimum support threshold, a projection of each of the most recent data windows is compactly stored in main memory; each projection includes all items that have been frequently observed in the last windows. Users can perform constrained itemset extraction from either a single data window or multiple windows. A summary of the interesting itemsets mined from all available data is generated on a regular basis and compactly stored in a persistent data structure to efficiently support further analysis (e.g., investigating only a selected past data window). Experimental results on both real and synthetic data streams show the effectiveness and efficiency of the proposed approach in mining interesting itemsets through both on-line and off-line queries.
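
To make the window-projection idea concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not describe the actual data structures, so everything here is an assumption for illustration: the class name `WindowStore`, the parameters `max_windows`, `must_contain`, and `max_size`, and the brute-force enumeration of candidate itemsets. For brevity the sketch keeps the raw windows and derives each projection on demand, whereas the abstract describes storing the projections themselves.

```python
from collections import Counter, deque
from itertools import combinations


class WindowStore:
    """Hypothetical in-memory store for the most recent data windows."""

    def __init__(self, max_windows, min_support):
        # min_support is a relative threshold in [0, 1] (an assumption).
        self.min_support = min_support
        # Only the most recent max_windows windows are retained.
        self.windows = deque(maxlen=max_windows)

    def add_window(self, transactions):
        """Append a new window; the oldest one is dropped when full."""
        self.windows.append([frozenset(t) for t in transactions])

    def frequent_items(self):
        """Items whose support over all retained windows meets min_support."""
        counts = Counter(i for w in self.windows for t in w for i in t)
        total = sum(len(w) for w in self.windows)
        return {i for i, c in counts.items() if c >= self.min_support * total}

    def projection(self, index):
        """Window restricted to frequent items; empty transactions are dropped."""
        keep = self.frequent_items()
        return [t & keep for t in self.windows[index] if t & keep]

    def mine(self, window_indexes, min_support, must_contain=(), max_size=3):
        """Constrained itemset extraction over one or more windows.

        Brute-force enumeration up to max_size items, used here only to keep
        the sketch short; it is not the mining algorithm of the paper.
        """
        required = frozenset(must_contain)
        counts = Counter()
        total = 0
        for idx in window_indexes:
            # Support is measured against the full window size.
            total += len(self.windows[idx])
            for t in self.projection(idx):
                items = sorted(t)
                for size in range(1, min(max_size, len(items)) + 1):
                    for combo in combinations(items, size):
                        counts[frozenset(combo)] += 1
        threshold = min_support * total
        return {
            itemset: count
            for itemset, count in counts.items()
            if count >= threshold and required <= itemset
        }


store = WindowStore(max_windows=2, min_support=0.5)
store.add_window([{"a", "b"}, {"a", "b", "c"}, {"a", "d"}])
store.add_window([{"a", "b"}, {"b", "c"}, {"a", "b", "e"}])

# On-line style query: only the most recent window.
recent = store.mine([-1], min_support=0.5)
# Multi-window query with an item constraint.
with_a = store.mine([0, 1], min_support=0.5, must_contain={"a"})
```

Under these assumptions, a query over a single index corresponds to mining one data window, while passing several indexes aggregates them. The periodic summary stored in a persistent structure is not modeled here.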