Randomly sampling maximal itemsets

Authors:
Sandy Moens;Bart Goethals
Affiliations:
University of Antwerp, Belgium;University of Antwerp, Belgium
Venue:
Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics
Year:
2013


Abstract

Pattern mining techniques generally enumerate many uninteresting and redundant patterns. To obtain less redundant collections, techniques exist that give condensed representations of these collections. However, the proposed techniques often rely on complete enumeration of the pattern space, which can be prohibitive in terms of time and memory. Sampling can be used to filter the output space of patterns without explicit enumeration. We propose a framework for random sampling of maximal itemsets from transactional databases. The presented framework can use any monotonically decreasing measure as its interestingness criterion. Moreover, we use an approximation measure to guide the search for maximal sets to different parts of the output space. Our experiments show that the method can rapidly generate small collections of good-quality patterns. The sampling framework has been implemented in the interactive visual data mining tool MIME, thus enabling users to quickly sample a collection of patterns and analyze the results.
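
The idea of sampling a maximal itemset under a monotonically decreasing measure can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain support as the measure, picks each extension uniformly at random instead of using the approximation measure mentioned in the abstract, and the names `support` and `sample_maximal_itemset` are hypothetical.

```python
import random


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def sample_maximal_itemset(transactions, min_support, rng):
    """Grow a random itemset until no single item can be added.

    Illustrative assumption: extensions are drawn uniformly at random.
    The result is maximal with respect to `min_support`.
    """
    if len(transactions) < min_support:
        raise ValueError("even the empty itemset is below min_support")

    current = frozenset()
    # Sorted so that a fixed seed gives a reproducible walk.
    items = sorted(set().union(*transactions))
    while True:
        # Keep only items whose addition stays at or above the threshold.
        # Support never increases when an itemset grows, so an item
        # rejected here can be discarded for the rest of the walk.
        items = [
            i for i in items
            if i not in current
            and support(current | {i}, transactions) >= min_support
        ]
        if not items:
            return current
        current = current | {rng.choice(items)}


if __name__ == "__main__":
    db = [frozenset("abc"), frozenset("ab"), frozenset("ac"),
          frozenset("bc"), frozenset("abc")]
    rng = random.Random(7)
    samples = [sample_maximal_itemset(db, 3, rng) for _ in range(5)]
    print([sorted(s) for s in samples])
```

Each call performs one random walk from the empty set, so repeated calls give a small collection of maximal itemsets without enumerating the full pattern space. Uniform extension does not yield a uniform distribution over maximal itemsets, which is one reason a guiding measure is useful.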