Frequent Pattern Mining on Message Passing Multiprocessor Systems

Authors:
Asif Javed;Ashfaq Khokhar
Affiliations:
University of Illinois at Chicago, USA;University of Illinois at Chicago, USA. ashfaq@eecs.uic.edu
Venue:
Distributed and Parallel Databases
Year:
2004

Abstract

Extraction of frequent patterns from transaction-oriented databases is crucial to several data mining tasks, such as association rule generation, time series analysis, and classification. Most of these tasks require multiple passes over the database, and because the database is usually large, scalable high-performance solutions involving multiple processors are required. This paper presents an efficient, scalable parallel algorithm for mining frequent patterns on shared-nothing parallel platforms. The proposed algorithm is based on one of the best-known sequential techniques, the Frequent Pattern (FP) Growth algorithm. Unlike most earlier parallel approaches, which are based on variants of the Apriori algorithm, the algorithm presented in this paper does not explicitly require the entire counting data structure to be duplicated on each processor. Furthermore, the proposed algorithm incurs minimal communication (and hence synchronization) overhead by efficiently partitioning the list of frequent elements over the processors. The experimental results show scalable performance over different machine and problem sizes. A comparison of the implementation results with existing parallel approaches shows significant gains in speedup. On an 8-processor machine, we report an average speedup of 6 across different problem sizes.
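
The abstract does not spell out how the list of frequent elements is partitioned, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each processor counts items in its own share of the database, that the per-processor counts are merged (here by a plain loop standing in for a message-passing reduction), and that frequent items are then handed out round-robin. The function names `local_counts`, `frequent_items`, and `assign_items`, the round-robin policy, and the toy data are all hypothetical.

```python
from collections import Counter


def local_counts(transactions):
    """Count item occurrences in one processor's share of the database."""
    counts = Counter()
    for transaction in transactions:
        counts.update(set(transaction))  # count each item once per transaction
    return counts


def frequent_items(partitions, min_support):
    """Merge per-processor counts and keep items meeting min_support.

    The loop below is a stand-in for a reduction over a message-passing
    layer; it is an assumption, not the paper's communication scheme.
    Items are returned sorted by descending support, ties broken by name.
    """
    total = Counter()
    for part in partitions:
        total.update(local_counts(part))
    items = [(item, n) for item, n in total.items() if n >= min_support]
    items.sort(key=lambda pair: (-pair[1], pair[0]))
    return items


def assign_items(items, num_procs):
    """Assign frequent items to processors round-robin (hypothetical policy)."""
    owners = [[] for _ in range(num_procs)]
    for index, (item, _support) in enumerate(items):
        owners[index % num_procs].append(item)
    return owners


# Toy database split across two "processors" (illustrative data only).
partitions = [
    [["a", "b", "c"], ["a", "c"]],
    [["a", "d"], ["b", "c"]],
]
items = frequent_items(partitions, min_support=2)
owners = assign_items(items, num_procs=2)
print(items)
print(owners)
```

In a sketch like this, each processor would then mine only the patterns that end in the items it owns, which is one way to avoid duplicating the full counting structure everywhere. A real implementation would likely balance the assignment by estimated work rather than by position in the list.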