Finding minimum representative pattern sets

Authors:
Guimei Liu;Haojun Zhang;Limsoon Wong
Affiliations:
National University of Singapore, Singapore, Singapore;National University of Singapore, Singapore, Singapore;National University of Singapore, Singapore, Singapore
Venue:
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2012

Citing 20
Cited 1

Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Efficiently mining long patterns from databases

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
The implementation and performance of compressed databases

ACM SIGMOD Record
A condensed representation to find frequent patterns

PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Free-Sets: A Condensed Representation of Boolean Data for the Approximation of Frequency Queries

Data Mining and Knowledge Discovery
Discovering Frequent Closed Itemsets for Association Rules

ICDT '99 Proceedings of the 7th International Conference on Database Theory
Mining All Non-derivable Frequent Itemsets

PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
Mining Minimal Non-redundant Association Rules Using Frequent Closed Itemsets

CL '00 Proceedings of the First International Conference on Computational Logic
CLOSET+: searching for the best strategies for mining frequent closed itemsets

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Approximating a collection of frequent sets

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
TFP: An Efficient Algorithm for Mining Top-K Frequent Closed Itemsets

IEEE Transactions on Knowledge and Data Engineering
Summarizing itemset patterns: a profile-based approach

Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Mining compressed frequent-pattern sets

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Mining condensed frequent-pattern bases

Knowledge and Information Systems
Extracting redundancy-aware top-k patterns

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Summarizing itemset patterns using probabilistic models

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
CFP-tree: A compact disk-based structure for storing and querying frequent itemsets

Information Systems
Effective and efficient itemset pattern summarization: regression-based approaches

Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
CP-summary: a concise representation for browsing frequent itemsets

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Tell me what i need to know: succinctly summarizing data with itemsets

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining

Summarizing probabilistic frequent patterns: a fast approach

Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining

Quantified Score

Hi-index	0.00

Visualization

Abstract

Frequent pattern mining often produces an enormous number of frequent patterns, which imposes a great challenge on understanding and further analysis of the generated patterns. This calls for finding a small number of representative patterns to best approximate all other patterns. An ideal approach should 1) produce a minimum number of representative patterns; 2) restore the support of all patterns with error guarantee; and 3) have good efficiency. Few existing approaches can satisfy all the three requirements. In this paper, we develop two algorithms, MinRPset and FlexRPset, for finding minimum representative pattern sets. Both algorithms provide error guarantee. MinRPset produces the smallest solution that we can possibly have in practice under the given problem setting, and it takes a reasonable amount of time to finish. FlexRPset is developed based on MinRPset. It provides one extra parameter K to allow users to make a trade-off between result size and efficiency. Our experiment results show that MinRPset and FlexRPset produce fewer representative patterns than RPlocal---an efficient algorithm that is developed for solving the same problem. FlexRPset can be slightly faster than RPlocal when K is small.