Hash Partitioned apriori in Parallel and Distributed Data Mining Environment with Dynamic Data Allocation Approach

Authors:
Sujni Paul;V. Saravanan
Venue:
ICCSIT '08 Proceedings of the 2008 International Conference on Computer Science and Information Technology
Year:
2008

Abstract

A parallel system is composed mainly of cost-optimal parallel algorithms. This paper considers one such algorithm, Hash Partitioned Apriori (HPA). HPA partitions the candidate itemsets among processors using a hash function, much as a hash join does in relational databases. Because HPA makes effective use of the combined memory of all the processors, it works well for large-scale data mining in a parallel and distributed environment. Dynamic data allocation is discussed as an optimization technique for executing this application in a parallel and distributed environment. Writing parallel data mining algorithms in a distributed environment is a non-trivial task. The main purpose of the proposed method is to meet certain challenges associated with parallel and distributed data mining: (i) minimizing I/O, (ii) increasing processing speed, and (iii) reducing communication cost.
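The hash-partitioning idea in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the names `owner` and `hpa_count_pass`, the use of CRC32 as the hash function, and the toy transactions are all assumptions made for illustration. For brevity the sketch counts every k-subset of each transaction instead of first generating candidates from the frequent (k-1)-itemsets, and it does not model dynamic data allocation.

```python
from collections import Counter
from itertools import combinations
from zlib import crc32


def owner(itemset, num_procs):
    """Map an itemset to a processor id (hypothetical helper).

    CRC32 is an assumed, deterministic stand-in for the paper's hash function.
    """
    key = ",".join(sorted(itemset)).encode("utf-8")
    return crc32(key) % num_procs


def hpa_count_pass(local_partitions, k, num_procs, min_support):
    """Simulate one hash-partitioned counting pass for k-itemsets.

    local_partitions[p] holds the transactions stored on processor p.
    Each k-subset is routed to the processor that owns it, so every
    itemset is counted in exactly one hash table.
    """
    counts = [Counter() for _ in range(num_procs)]
    messages = 0  # itemsets routed to a processor other than the reader
    for p, transactions in enumerate(local_partitions):
        for t in transactions:
            for subset in combinations(sorted(set(t)), k):
                dest = owner(subset, num_procs)
                if dest != p:
                    messages += 1
                counts[dest][subset] += 1
    # Merge the per-processor tables, keeping only frequent itemsets.
    frequent = {
        s: c for table in counts for s, c in table.items() if c >= min_support
    }
    return frequent, counts, messages


# Toy data: two simulated processors, five transactions in total.
partitions = [
    [{"a", "b", "c"}, {"a", "b"}],
    [{"a", "c"}, {"b", "c"}, {"a", "b", "c"}],
]
frequent, counts, messages = hpa_count_pass(
    partitions, k=2, num_procs=2, min_support=3
)
```

Because each itemset has a single owner, the per-processor tables are disjoint and no candidate is replicated, which is the memory advantage the abstract describes. The `messages` counter gives a rough sense of the communication cost that the abstract lists as a challenge.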