A highly parallel algorithm for frequent itemset mining
MCPR'10 Proceedings of the 2nd Mexican conference on Pattern recognition: Advances in pattern recognition
The search for frequent patterns in transactional databases is considered one of the most important data mining problems. Several parallel and sequential algorithms have been proposed in the literature to solve it. Almost all of these algorithms make repeated passes over the dataset to determine the set of frequent itemsets, which incurs high I/O overhead. In the parallel case, most algorithms also perform a sum-reduction at the end of each pass to construct the global counts, which incurs high synchronization cost. We present a novel algorithm that efficiently exploits the trade-offs between computation, communication, memory usage, and synchronization. The algorithm was implemented on a cluster of SMP nodes, combining the distributed and shared memory paradigms. This paper presents the results of our algorithm for different data sizes and different numbers of processors, and studies the effect of these variations on overall performance.
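
To make the per-pass sum-reduction mentioned above concrete, the following is a minimal sketch of the conventional scheme the abstract describes, not of the algorithm this paper proposes. It assumes each "node" is simply a list of transactions held in one process; the function names (`local_counts`, `sum_reduce`, `frequent_itemsets`) and the sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition, k):
    """Count every k-itemset that occurs in one node's transactions."""
    counts = Counter()
    for transaction in partition:
        # Sort and deduplicate so each itemset has one canonical tuple form.
        for itemset in combinations(sorted(set(transaction)), k):
            counts[itemset] += 1
    return counts


def sum_reduce(per_node_counts):
    """Merge local counts into global counts.

    In a real cluster this is the synchronization point reached at the
    end of every pass; here it is simulated with a plain loop.
    """
    total = Counter()
    for counts in per_node_counts:
        total.update(counts)
    return total


def frequent_itemsets(partitions, k, min_support):
    """Return the k-itemsets whose global count reaches min_support."""
    global_counts = sum_reduce(local_counts(p, k) for p in partitions)
    return {s: n for s, n in global_counts.items() if n >= min_support}


# Two simulated nodes, each holding two transactions (illustrative data).
partitions = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["a", "b", "c"]],
]
result = frequent_itemsets(partitions, k=2, min_support=3)
```

With this sample data, `result` contains the pairs `("a", "b")` and `("a", "c")`, each with a global count of 3. Neither node alone can decide that a pair is frequent, which is why every pass in this scheme ends by waiting on the reduction.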