Block interaction: a generative summarization scheme for frequent patterns

  • Authors:
  • Ruoming Jin;Yang Xiang;Hui Hong;Kun Huang

  • Affiliations:
  • Kent State University, Kent, OH;The Ohio State University, Columbus, OH;Kent State University, Kent, OH;The Ohio State University, Columbus, OH

  • Venue:
  • Proceedings of the ACM SIGKDD Workshop on Useful Patterns
  • Year:
  • 2010

Abstract

Frequent pattern mining is an essential tool in the data miner's toolbox, with applications to data ranging from itemsets, sequences, and trees to graphs and topological structures. Despite its importance, a major issue has clouded the frequent pattern mining methodology: the number of frequent patterns can easily become too large to be analyzed and used. Though many efforts have tried to tackle this issue, it remains an open problem. In this paper, we propose a novel block-interaction model to address it. This model can help summarize a collection of frequent itemsets and provide accurate support information using only a small number of frequent itemsets. At the heart of our approach is a set of core blocks, each of which is the Cartesian product of a frequent itemset and its supporting transactions. These core blocks interact with each other through two basic operators (horizontal union and vertical union) to form the complex structure of frequent patterns. Each frequent itemset can be expressed, and its frequency accurately recovered, through a combination of these core blocks. This is also the first complete generative model for describing the formation of frequent patterns. Specifically, we relate the problem of finding a minimal block-interaction model to a generalized set-cover problem, referred to as the graph set cover (GSC) problem. We develop an efficient algorithm based on GSC to discover the core blocks. A detailed experimental evaluation demonstrates the effectiveness of our approach.
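As a rough illustration of the block idea, the sketch below builds a block as the Cartesian product of an itemset and the transactions that contain it, as the abstract describes. The abstract does not define the two operators, so the versions here are only one plausible reading and are an assumption: horizontal union intersects the transaction sets and unions the itemsets, while vertical union unions the transaction sets and intersects the itemsets. The names `Block`, `core_block`, `horizontal_union`, `vertical_union`, and the toy database `DB` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A block: a set of transaction ids paired with a set of items."""
    transactions: frozenset
    items: frozenset

    def cells(self):
        # Cartesian product of transactions and items.
        return {(t, i) for t in self.transactions for i in self.items}

    @property
    def support(self):
        # Number of transactions covered by the block.
        return len(self.transactions)


def core_block(db, itemset):
    """Pair an itemset with every transaction in db that contains it."""
    items = frozenset(itemset)
    tids = frozenset(t for t, row in db.items() if items <= row)
    return Block(tids, items)


def horizontal_union(a, b):
    # ASSUMED definition: wider itemset, only the shared transactions.
    return Block(a.transactions & b.transactions, a.items | b.items)


def vertical_union(a, b):
    # ASSUMED definition: more transactions, only the shared items.
    return Block(a.transactions | b.transactions, a.items & b.items)


# Hypothetical toy database: transaction id -> set of items.
DB = {
    1: frozenset("abc"),
    2: frozenset("abd"),
    3: frozenset("acd"),
    4: frozenset("bcd"),
}

ab = core_block(DB, "ab")
ac = core_block(DB, "ac")
abc = horizontal_union(ab, ac)   # candidate block for itemset {a, b, c}
a_only = vertical_union(ab, ac)  # candidate block for itemset {a}
```

Under these assumed definitions, combining the blocks for {a, b} and {a, c} gives blocks whose supports match the true supports of {a, b, c} and {a} in the toy database. In general such a combination yields only a lower bound on support unless the chosen blocks cover all supporting transactions.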