Frequent-itemset mining is an important part of data mining. It is a computationally and memory-intensive task with a large number of scientific and statistical application areas. In many of them, datasets can easily grow to tens or even several hundred gigabytes, so efficient algorithms are required to process them. In recent years, many efficient sequential mining algorithms have been proposed; however, they cannot exploit current and future systems that provide a high degree of parallelism. In contrast, the number of parallel frequent-itemset mining algorithms is rather small, and most of them do not scale well as the number of threads grows large. In this paper, we present pcApriori, a highly scalable mining algorithm that is based on the well-known Apriori algorithm and optimized for processing very large datasets on multiprocessor systems. The key idea of pcApriori is to employ a modified producer-consumer processing scheme, which partitions the data during processing and distributes it to the available threads. We conduct many experiments on large datasets. pcApriori scales almost linearly on our test system comprising 32 cores.
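To make the producer-consumer idea concrete, the following is a minimal sketch of Apriori in which the support-counting pass is fed through a work queue. It is an illustration under stated assumptions, not the pcApriori implementation: the function names `count_support` and `apriori`, the fixed `chunk_size`, and the sentinel-based shutdown are all hypothetical choices, and the "modified" scheme described above is not reproduced here.

```python
import threading
from collections import Counter
from itertools import combinations
from queue import Queue

SENTINEL = None  # hypothetical marker: tells a consumer that no more chunks will arrive


def count_support(transactions, candidates, n_workers=4, chunk_size=2):
    """Count, for each candidate itemset, the transactions that contain it."""
    work = Queue(maxsize=2 * n_workers)  # bounded queue between producer and consumers
    totals = Counter()
    lock = threading.Lock()

    def consumer():
        local = Counter()  # per-thread counts, merged once at the end
        while True:
            chunk = work.get()
            if chunk is SENTINEL:
                break
            for transaction in chunk:
                for candidate in candidates:
                    if candidate <= transaction:  # subset test
                        local[candidate] += 1
        with lock:
            totals.update(local)

    workers = [threading.Thread(target=consumer) for _ in range(n_workers)]
    for worker in workers:
        worker.start()
    # Producer: partition the data into chunks and hand them to the consumers.
    for start in range(0, len(transactions), chunk_size):
        work.put(transactions[start:start + chunk_size])
    for _ in workers:
        work.put(SENTINEL)  # one shutdown marker per consumer
    for worker in workers:
        worker.join()
    return totals


def apriori(transactions, min_support, n_workers=4):
    """Return {itemset: support count} for all itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        counts = count_support(transactions, candidates, n_workers)
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in sorted(apriori(data, 3).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), support)
```

In CPython the global interpreter lock prevents these threads from counting in parallel, so the sketch shows only the coordination pattern (a producer partitioning the data, consumers counting locally and merging once), not the scaling behaviour reported above.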