Efficient mining of frequent itemsets from a database plays an essential role in many data mining tasks, such as association rule mining. Many algorithms represent a database as a prefix-tree and mine frequent itemsets by recursively constructing conditional prefix-trees from it. A (conditional) prefix-tree can be stored in various structures, and the cost of constructing and traversing these storage structures accounts for a large share of the total cost of such algorithms. The PatriciaMine algorithm stores a prefix-tree in a Patricia trie and shows good performance. In this study, we introduce an efficient Patricia* structure for storing a prefix-tree. A Patricia* structure is more compact and more contiguous than the corresponding Patricia trie, so it costs less to construct and traverse. Previous prefix-tree-based algorithms adopt similar mining procedures, in which most nodes of a prefix-tree are accessed repeatedly while the prefix-tree is processed. This paper presents a novel mining procedure that greatly reduces the number of node accesses for a prefix-tree. We propose the PatriciaMine* algorithm, which combines the Patricia* structure with the proposed procedure. Experimental data show that, on various databases, PatriciaMine* outperforms not only PatriciaMine but also several fast algorithms, such as FPgrowth* and dEclat.
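For readers unfamiliar with the general scheme the abstract refers to, the following is a minimal Python sketch of prefix-tree-based mining with recursively built conditional prefix-trees. It is an illustration only: it uses a plain pointer-based tree, not a Patricia trie or the Patricia* structure, and it does not implement the reduced-access mining procedure proposed in the paper. All names (`Node`, `build_tree`, `mine`, `frequent_itemsets`) are hypothetical and chosen for this example.

```python
from collections import defaultdict


class Node:
    """One prefix-tree node (hypothetical layout, not Patricia*)."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(weighted_transactions, min_support):
    """Build a prefix-tree from (items, count) pairs.

    Returns (header, frequent): header maps each frequent item to the
    nodes that carry it; frequent maps each frequent item to its support.
    """
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}

    root = Node(None, None)
    header = defaultdict(list)
    for items, count in weighted_transactions:
        # Keep only frequent items, ordered by descending support
        # (ties broken by item name) so common prefixes are shared.
        path = sorted((i for i in set(items) if i in frequent),
                      key=lambda i: (-frequent[i], i))
        node = root
        for item in path:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent


def mine(weighted_transactions, min_support, suffix=()):
    """Recursively mine frequent itemsets via conditional prefix-trees."""
    result = {}
    header, frequent = build_tree(weighted_transactions, min_support)
    for item, nodes in header.items():
        itemset = tuple(sorted(suffix + (item,)))
        result[itemset] = frequent[item]
        # Conditional database for `item`: the prefix path above each
        # node carrying `item`, weighted by that node's count.
        conditional = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        result.update(mine(conditional, min_support, itemset))
    return result


def frequent_itemsets(transactions, min_support):
    """Entry point: each transaction is an iterable of hashable, sortable items."""
    return mine([(list(t), 1) for t in transactions], min_support)
```

Calling `frequent_itemsets(transactions, min_support)` returns a dictionary that maps each frequent itemset, as a sorted tuple, to its support count. Note that every recursive call rebuilds a tree and walks parent pointers from each node, which is the kind of repeated construction and traversal cost that the abstract identifies as dominant.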