Demand-driven frequent itemset mining using pattern structures

  • Authors:
  • Haixun Wang;Chang-Shing Perng;Sheng Ma;Philip S. Yu

  • Affiliations:
  • IBM T.J. Watson Research Center, 10532, Hawthorne, NY, USA;IBM T.J. Watson Research Center, 10532, Hawthorne, NY, USA;IBM T.J. Watson Research Center, 10532, Hawthorne, NY, USA;IBM T.J. Watson Research Center, 10532, Hawthorne, NY, USA

  • Venue:
  • Knowledge and Information Systems
  • Year:
  • 2005

Abstract

Frequent itemset mining aims to discover patterns whose support exceeds a given threshold. In many applications, including the network event management systems that motivated this work, patterns are composed of items, each described by a subset of the attributes of a relational table. Because the mining space is exponential, handling user preferences and mining constraints efficiently becomes the first priority for a mining algorithm. User preferences and mining constraints are often expressed in terms of patterns’ attribute structures. Unlike traditional methods that mine all frequent patterns indiscriminately, we regard frequent itemset mining as a two-step process: mining the pattern structures, and then mining the patterns within each pattern structure. In this paper, we present a novel architecture that uses pattern structures to organize the mining space. Compared with previous techniques, our approach has two advantages: (i) by exploiting the interrelationships among pattern structures, it significantly reduces mining execution time; and (ii) more importantly, it allows simple, high-level user preferences and mining constraints to be incorporated into the mining process efficiently. We demonstrate these advantages in experiments on both synthetic and real-life datasets.
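The two-step view described in the abstract can be illustrated with a small sketch. This is not the architecture from the paper, only a brute-force toy under stated assumptions: the attribute names (`host`, `type`), the sample data, and the helper names (`pattern_structures`, `mine_structure`, `mine`, `allow`) are all hypothetical, and the sketch does not exploit interrelationships among pattern structures for pruning. It shows only how a user constraint expressed on attribute structures can restrict which parts of the mining space are explored at all.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement, permutations

# Hypothetical toy data: each transaction is a list of event records
# drawn from a relational table with attributes "host" and "type".
ATTRS = ("host", "type")
TRANSACTIONS = [
    [{"host": "A", "type": "down"}, {"host": "B", "type": "alarm"}],
    [{"host": "A", "type": "down"}, {"host": "B", "type": "reset"}],
    [{"host": "A", "type": "up"}, {"host": "C", "type": "alarm"}],
]

def pattern_structures(attrs, max_items):
    """Step 1 (simplified): enumerate candidate pattern structures.

    A structure is a tuple of attribute subsets, one subset per item.
    """
    subsets = [c for r in range(1, len(attrs) + 1)
               for c in combinations(attrs, r)]
    for k in range(1, max_items + 1):
        yield from combinations_with_replacement(subsets, k)

def project(record, attrs):
    """Describe a record by a subset of its attributes, giving one item."""
    return tuple((a, record[a]) for a in attrs)

def mine_structure(transactions, structure, min_support):
    """Step 2: find frequent patterns that fit one pattern structure."""
    counts = Counter()
    for records in transactions:
        seen = set()  # count each pattern at most once per transaction
        for chosen in permutations(records, len(structure)):
            items = (project(r, s) for r, s in zip(chosen, structure))
            seen.add(tuple(sorted(items)))  # canonical item order
        counts.update(seen)
    return {p: n for p, n in counts.items() if n >= min_support}

def mine(transactions, attrs, min_support, max_items=2, allow=lambda s: True):
    """Mine only the structures accepted by the user constraint `allow`."""
    result = {}
    for structure in pattern_structures(attrs, max_items):
        if not allow(structure):
            continue  # constraint applied before any counting is done
        found = mine_structure(transactions, structure, min_support)
        if found:
            result[structure] = found
    return result

# Example constraint: only patterns whose items are described by host alone.
host_only = lambda s: all(subset == ("host",) for subset in s)
result = mine(TRANSACTIONS, ATTRS, min_support=2, allow=host_only)
```

With this constraint, only two of the nine candidate structures are ever counted, which is the practical point of treating structures as first-class objects: the preference is checked per structure rather than per pattern.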