Block interaction: a generative summarization scheme for frequent patterns

Authors:
Ruoming Jin;Yang Xiang;Hui Hong;Kun Huang
Affiliations:
Kent State University, Kent, OH;The Ohio State University, Columbus, OH;Kent State University, Kent, OH;The Ohio State University, Columbus, OH
Venue:
Proceedings of the ACM SIGKDD Workshop on Useful Patterns
Year:
2010

Abstract

Frequent pattern mining is an essential tool in the data miner's toolbox, with applications ranging from itemsets, sequences, and trees to graphs and topological structures. Despite its importance, one major issue has clouded the methodology: the number of frequent patterns can easily become too large to analyze and use. Although many efforts have tried to tackle this issue, it remains an open problem. In this paper, we propose a novel block-interaction model to address it. The model summarizes a collection of frequent itemsets and provides accurate support information using only a small number of frequent itemsets. At the heart of our approach is a set of core blocks, each of which is the Cartesian product of a frequent itemset and its support transactions. These core blocks interact with each other through two basic operators (horizontal union and vertical union) to generate the full complexity of frequent patterns. Each frequent itemset can be expressed, and its frequency accurately recovered, through a combination of these core blocks. This is also the first complete generative model for describing the formation of frequent patterns. Specifically, we relate the problem of finding a minimal block-interaction model to a generalized set-cover problem, referred to as the graph set cover (GSC) problem. We develop an efficient algorithm based on GSC to discover the core blocks. A detailed experimental evaluation demonstrates the effectiveness of our approach.
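
To make the notion of a block concrete, the following is a minimal Python sketch on a toy database. The data, the function names, and in particular the two operator definitions are assumptions for illustration: the abstract only names horizontal and vertical union without defining them, so the sketch takes horizontal union to intersect the transaction sets and unite the itemsets, and vertical union to do the reverse. It is not the algorithm developed in the paper.

```python
from itertools import product

# Toy transactional database (hypothetical data): transaction id -> items.
DB = {
    1: {"a", "b", "c"},
    2: {"a", "b"},
    3: {"b", "c"},
    4: {"a", "b", "c"},
}

def support_transactions(itemset, db=DB):
    """Ids of all transactions that contain every item of `itemset`."""
    return frozenset(t for t, items in db.items() if set(itemset) <= items)

def block(itemset, db=DB):
    """A block, stored as the pair (support transactions, itemset)."""
    return (support_transactions(itemset, db), frozenset(itemset))

def cells(b):
    """Expand a block into its Cartesian product of (transaction, item) cells."""
    tids, items = b
    return set(product(tids, items))

def horizontal_union(b1, b2):
    """ASSUMED definition: shared transactions, combined items."""
    return (b1[0] & b2[0], b1[1] | b2[1])

def vertical_union(b1, b2):
    """ASSUMED definition: combined transactions, shared items."""
    return (b1[0] | b2[0], b1[1] & b2[1])

def support(b, db=DB):
    """Relative support implied by a block's transaction set."""
    return len(b[0]) / len(db)

ab = block({"a", "b"})
bc = block({"b", "c"})
abc = horizontal_union(ab, bc)
print(sorted(abc[0]), sorted(abc[1]), support(abc))
```

In this toy example, the horizontal union of the blocks for {a, b} and {b, c} yields the itemset {a, b, c} together with the transactions that actually contain it, which illustrates how support can be recovered from a combination of blocks rather than stored for every itemset.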