In this paper, we propose four parallel algorithms (NPA, SPA, HPA and HPA-ELD) for mining association rules on shared-nothing parallel machines to improve mining performance.

In NPA, the candidate itemsets are simply copied to all the processors, which can lead to memory overflow for large transaction databases. The remaining three algorithms partition the candidate itemsets over the processors. If they are partitioned simply (SPA), transaction data has to be broadcast to all processors. HPA partitions the candidate itemsets using a hash function to eliminate broadcasting, which also reduces the comparison workload significantly. HPA-ELD fully utilizes the available memory space by detecting the extremely large itemsets and copying them, which is also very effective at flattening the load over the processors.

We implemented these algorithms in a shared-nothing environment. Performance evaluations show that the best algorithm, HPA-ELD, attains good linearity on speedup ratio and is effective for handling skew.
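The difference between copying the candidates (NPA) and hash-partitioning them (HPA) can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the paper's implementation: the processors are simulated as list indices, the hash function `owner` is a made-up polynomial hash, the names `partition_candidates`, `hpa_count` and `npa_count` are hypothetical, and "sending" an itemset to another processor is modeled as incrementing a counter.

```python
from collections import Counter
from itertools import combinations


def owner(itemset, num_procs):
    """Hypothetical hash function: map an itemset (tuple of ints) to a processor id."""
    h = 0
    for item in itemset:
        h = h * 31 + item
    return h % num_procs


def partition_candidates(candidates, num_procs):
    """Give each candidate itemset to exactly one simulated processor."""
    parts = [set() for _ in range(num_procs)]
    for cand in candidates:
        parts[owner(cand, num_procs)].add(cand)
    return parts


def hpa_count(local_dbs, candidates, k):
    """HPA-style counting: route each k-subset of a transaction to its owner.

    local_dbs[p] is the list of transactions stored on simulated processor p.
    Returns the global support counts and the number of itemsets that had to
    be sent to a processor other than the one holding the transaction.
    """
    num_procs = len(local_dbs)
    parts = partition_candidates(candidates, num_procs)
    counts = [Counter() for _ in range(num_procs)]
    remote_sends = 0
    for src, db in enumerate(local_dbs):
        for transaction in db:
            for subset in combinations(sorted(transaction), k):
                dst = owner(subset, num_procs)
                if dst != src:
                    remote_sends += 1  # stands in for a point-to-point message
                if subset in parts[dst]:
                    counts[dst][subset] += 1
    total = Counter()
    for local in counts:
        total.update(local)  # partitions are disjoint, so this is a plain merge
    return total, remote_sends


def npa_count(local_dbs, candidates, k):
    """NPA-style counting: every processor holds all candidates; counts are summed."""
    all_candidates = set(candidates)
    total = Counter()
    for db in local_dbs:
        local = Counter()
        for transaction in db:
            for subset in combinations(sorted(transaction), k):
                if subset in all_candidates:
                    local[subset] += 1
        total.update(local)  # stands in for the global reduction step
    return total


if __name__ == "__main__":
    dbs = [[{1, 2, 3}, {1, 2}], [{2, 3}, {1, 2, 3}]]  # two simulated processors
    cands = [(1, 2), (1, 3), (2, 3)]
    hpa_total, sends = hpa_count(dbs, cands, k=2)
    print(dict(hpa_total), "remote sends:", sends)
    print(dict(npa_count(dbs, cands, k=2)))
```

Both functions produce the same support counts. What changes is where memory and communication are spent: in the NPA-style version every simulated processor needs room for the full candidate set, while in the HPA-style version each processor stores only its own partition and itemsets are sent to a single destination instead of transactions being broadcast.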