Randomly sampling maximal itemsets

  • Authors: Sandy Moens; Bart Goethals

  • Affiliations: University of Antwerp, Belgium; University of Antwerp, Belgium

  • Venue: Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics
  • Year: 2013

Abstract

Pattern mining techniques generally enumerate many uninteresting and redundant patterns. Techniques that produce condensed representations yield less redundant collections, but they often rely on complete enumeration of the pattern space, which can be prohibitive in terms of time and memory. Sampling can be used to filter the output space of patterns without explicit enumeration. We propose a framework for random sampling of maximal itemsets from transactional databases. The framework can use any monotonically decreasing measure as the interestingness criterion. Moreover, we use an approximation measure to guide the search for maximal sets to different parts of the output space. Our experiments show that the method can rapidly generate small collections of good-quality patterns. The sampling framework has been implemented in the interactive visual data mining tool MIME, enabling users to quickly sample a collection of patterns and analyze the results.
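
To make the idea of sampling a maximal itemset without enumerating the pattern space concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes support (transaction count) as the monotonically decreasing measure, a fixed threshold `min_support`, and a uniform random choice among valid extensions; the approximation measure that the abstract says guides the search is not reproduced. The names `support` and `sample_maximal_itemset` are hypothetical.

```python
import random


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def sample_maximal_itemset(transactions, min_support, rng=random):
    """Random walk from the empty set up to one maximal frequent itemset.

    Assumption: extensions are chosen uniformly at random. The paper
    instead weights the choice with an approximation measure.
    """
    items = sorted(set().union(*transactions))
    current = frozenset()
    while True:
        # Items whose addition keeps the itemset at or above the threshold.
        candidates = [
            i for i in items
            if i not in current
            and support(current | {i}, transactions) >= min_support
        ]
        if not candidates:
            # Support never increases when items are added, so if no single
            # item can be added, no frequent superset exists: `current` is maximal.
            return current
        current = current | {rng.choice(candidates)}


if __name__ == "__main__":
    # Toy database (illustrative data, not from the paper).
    db = [{"a", "b"}, {"a", "b"}, {"b", "c"}, {"b", "c"}, {"a", "c"}]
    samples = {sample_maximal_itemset(db, min_support=2) for _ in range(20)}
    print(sorted(sorted(s) for s in samples))
```

Each call performs a single walk, so the cost depends on the length of the sampled itemset rather than on the size of the pattern space. Uniform extension choices do not produce a uniform distribution over maximal itemsets, which is one reason a guiding measure can be useful.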