Constrained itemset mining on a sequence of incoming data blocks

  • Authors:
  • Elena Baralis;Tania Cerquitelli;Silvia Chiusano

  • Affiliations:
  • Politecnico di Torino, Dipartimento di Automatica e Informatica, Corso Duca degli Abruzzi, 24, 10129 Torino, Italy;Politecnico di Torino, Dipartimento di Automatica e Informatica, Corso Duca degli Abruzzi, 24, 10129 Torino, Italy;Politecnico di Torino, Dipartimento di Automatica e Informatica, Corso Duca degli Abruzzi, 24, 10129 Torino, Italy

  • Venue:
  • International Journal of Intelligent Systems
  • Year:
  • 2010

Abstract

Many real-life databases are updated by means of incoming business information. In these databases (e.g., transactional data from large retail chains, call-detail records), the content evolves through periodic insertions (or deletions) of data blocks. Since data evolve over time, algorithms have to be devised to incrementally update data mining models. This paper presents a novel index, called I-Forest, to support itemset mining on incoming data blocks, where new blocks are inserted periodically or old blocks are discarded. The I-Forest structure provides a complete data representation and allows different kinds of analyses (e.g., investigating quarterly data), in addition to supporting user-defined time and support constraints. The I-Forest index has been implemented in the PostgreSQL open source DBMS and exploits its physical-level access methods. Experiments, run on both sparse and dense data distributions, show the effectiveness of the I-Forest-based approach in performing itemset mining with both time and support constraints. The I-Forest-based itemset mining technique often runs faster than the Prefix-Tree algorithm accessing static data on flat files. © 2010 Wiley Periodicals, Inc.
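
To make the notions of time and support constraints over data blocks concrete, here is a minimal Python sketch. It is not the I-Forest index and not the authors' algorithm: it is a naive enumeration that rescans the selected blocks on every call. The function name `mine_itemsets`, the block layout (a dict from an integer block timestamp to a list of transactions), and the `max_len` cap are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_itemsets(blocks, t_start, t_end, min_support, max_len=3):
    """Naive constrained itemset mining over time-stamped data blocks.

    blocks      : dict mapping a block timestamp to a list of transactions
                  (hypothetical layout, chosen for this sketch)
    t_start/end : time constraint, inclusive range of block timestamps
    min_support : support constraint, as a fraction of selected transactions
    max_len     : cap on itemset size to keep the enumeration small
    """
    # Time constraint: keep only transactions from blocks in the window.
    selected = [
        tx
        for ts, txs in blocks.items()
        if t_start <= ts <= t_end
        for tx in txs
    ]

    # Count every itemset of size 1..max_len occurring in each transaction.
    counts = Counter()
    for tx in selected:
        items = sorted(set(tx))
        for k in range(1, min(max_len, len(items)) + 1):
            counts.update(combinations(items, k))

    # Support constraint: keep itemsets that meet the minimum count.
    threshold = min_support * len(selected)
    return {itemset: c for itemset, c in counts.items() if c >= threshold}


# Three blocks; inserting a block is a dict assignment, discarding is `del`.
blocks = {
    1: [["a", "b"], ["a", "c"]],
    2: [["a", "b", "c"], ["b", "c"]],
    3: [["a", "b"], ["b"]],
}

# Mine only blocks 2 and 3, requiring support of at least 50%.
result = mine_itemsets(blocks, t_start=2, t_end=3, min_support=0.5)
```

With the sample data, four transactions fall inside the window, so an itemset needs at least two occurrences to be returned. An index-based approach such as the one described in the abstract aims to avoid this full rescan when blocks are added or dropped.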