Extraction of frequent patterns in transaction-oriented databases is crucial to several data mining tasks, such as association rule generation, time series analysis, and classification. Most of these mining tasks require multiple passes over the database, and when the database is large, which is usually the case, scalable high-performance solutions involving multiple processors are required. This paper presents an efficient, scalable parallel algorithm for mining frequent patterns on parallel shared-nothing platforms. The proposed algorithm is based on one of the best-known sequential techniques, the Frequent Pattern (FP) Growth algorithm. Unlike most earlier parallel approaches, which are based on variants of the Apriori algorithm, the algorithm presented in this paper does not require the entire counting data structure to be duplicated on each processor. Furthermore, the proposed algorithm keeps communication (and hence synchronization) overheads to a minimum by efficiently partitioning the list of frequent elements over the processors. The experimental results show scalable performance over different machine and problem sizes. A comparison of the implementation results with existing parallel approaches shows significant gains in speedup. On an 8-processor machine, we report an average speedup of 6 for different problem sizes.
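
The idea of partitioning the list of frequent elements over processors can be illustrated with a minimal sketch. The abstract does not specify the partitioning scheme, so the round-robin assignment below is an assumption chosen for illustration only. The function names `frequent_items` and `partition_round_robin`, the sample transactions, and the support threshold are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def frequent_items(transactions, min_support):
    """Return items with support count >= min_support, most frequent first."""
    # Count each item at most once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    kept = [(item, c) for item, c in counts.items() if c >= min_support]
    # Descending count; ties broken by item name so the order is deterministic.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in kept]


def partition_round_robin(items, num_procs):
    """Assign frequent items to processors round-robin (assumed scheme)."""
    return [items[p::num_procs] for p in range(num_procs)]


# Hypothetical toy database of five transactions.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c", "d"],
    ["b", "c"],
    ["a", "b", "c", "e"],
]

freq = frequent_items(transactions, min_support=3)
parts = partition_round_robin(freq, num_procs=2)
print(freq)   # frequent items in descending support order
print(parts)  # one sub-list of frequent items per processor
```

In a scheme of this kind, each processor would mine only the patterns associated with its own share of the frequent items, so no processor needs a full copy of the counting structure. How the paper balances load across processors is not stated in the abstract.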