LCM over ZBDDs: fast generation of very large-scale frequent itemsets using a compact graph-based representation

Authors:
Shin-Ichi Minato;Takeaki Uno;Hiroki Arimura
Affiliations:
Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan;National Institute of Informatics, Tokyo, Japan;Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan
Venue:
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Year:
2008

Abstract

Frequent itemset mining is one of the fundamental techniques for data mining and knowledge discovery. In the last decade, a number of efficient algorithms have been presented for frequent itemset mining, but most of them focus only on enumerating the itemsets that satisfy the given conditions; how to store and index the mining result for efficient data analysis is a separate matter. In this paper, we propose a fast algorithm for generating very large-scale sets of all/closed/maximal frequent itemsets using Zero-suppressed BDDs (ZBDDs), a compact graph-based data structure. Our method, "LCM over ZBDDs," is based on LCM, one of the most efficient state-of-the-art algorithms proposed thus far. It not only enumerates the itemsets but also generates a compact output data structure in main memory. The result can be efficiently postprocessed by using algebraic ZBDD operations. The original LCM is known as an output-linear time algorithm, but our new method requires time sub-linear in the number of frequent patterns when the ZBDD-based data compression works well. Our method will greatly accelerate the data mining process, and this will lead to a new style of in-memory processing for dealing with knowledge discovery problems.
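
To illustrate why a ZBDD can be far smaller than the list of itemsets it encodes, here is a minimal Python sketch. It is not the authors' implementation and does not include the LCM algorithm itself. The class name `ZBDD` and its methods (`node`, `from_set`, `union`, `count`, `size`) are hypothetical names chosen for this example. It assumes items are integers and that smaller items sit closer to the root.

```python
class ZBDD:
    """Toy zero-suppressed BDD manager (illustrative sketch only)."""
    EMPTY, BASE = 0, 1  # terminals: the empty family, and the family {empty set}

    def __init__(self):
        self.nodes = [None, None]   # index -> (item, lo, hi); 0 and 1 are terminals
        self.unique = {}            # (item, lo, hi) -> index, so equal nodes are shared
        self.union_cache = {}

    def node(self, item, lo, hi):
        if hi == self.EMPTY:        # zero-suppression rule: skip nodes whose 1-edge is empty
            return lo
        key = (item, lo, hi)
        if key not in self.unique:
            self.nodes.append(key)
            self.unique[key] = len(self.nodes) - 1
        return self.unique[key]

    def from_set(self, items):
        """Family containing exactly one itemset."""
        f = self.BASE
        for v in sorted(set(items), reverse=True):
            f = self.node(v, self.EMPTY, f)
        return f

    def _split(self, f, v):
        """Sub-families of f without item v and with item v (v removed)."""
        if f > 1 and self.nodes[f][0] == v:
            return self.nodes[f][1], self.nodes[f][2]
        return f, self.EMPTY

    def union(self, f, g):
        if f == self.EMPTY or f == g:
            return g
        if g == self.EMPTY:
            return f
        key = (min(f, g), max(f, g))
        if key not in self.union_cache:
            v = min(self.nodes[h][0] for h in (f, g) if h > 1)
            f0, f1 = self._split(f, v)
            g0, g1 = self._split(g, v)
            self.union_cache[key] = self.node(
                v, self.union(f0, g0), self.union(f1, g1))
        return self.union_cache[key]

    def count(self, f, memo=None):
        """Number of itemsets in the family rooted at f."""
        memo = {} if memo is None else memo
        if f <= 1:
            return f
        if f not in memo:
            _, lo, hi = self.nodes[f]
            memo[f] = self.count(lo, memo) + self.count(hi, memo)
        return memo[f]

    def size(self, f, seen=None):
        """Number of non-terminal nodes reachable from f."""
        seen = set() if seen is None else seen
        if f > 1 and f not in seen:
            seen.add(f)
            self.size(self.nodes[f][1], seen)
            self.size(self.nodes[f][2], seen)
        return len(seen)


z = ZBDD()

# A small family built by repeated union; the duplicate {1, 2} is absorbed.
family = z.EMPTY
for itemset in [{1, 2}, {1, 3}, {2, 3}, {1, 2}]:
    family = z.union(family, z.from_set(itemset))

# All subsets of 20 items: each level has a single shared node.
powerset = z.BASE
for v in range(20, 0, -1):
    powerset = z.node(v, powerset, powerset)

print(z.count(family), z.count(powerset), z.size(powerset))
```

In this toy case the family of all subsets of 20 items contains 2^20 itemsets but needs only 20 shared nodes. This is the kind of compression the abstract refers to when it says running time can be sub-linear in the number of frequent patterns. Real mining results compress less dramatically, and how well they compress depends on the data.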