Hash Partitioned Apriori in Parallel and Distributed Data Mining Environment with Dynamic Data Allocation Approach

  • Authors: Sujni Paul; V. Saravanan

  • Venue: ICCSIT '08 Proceedings of the 2008 International Conference on Computer Science and Information Technology
  • Year: 2008

Abstract

A parallel system is built mainly from parallel algorithms that are cost optimal. This paper considers one such algorithm, Hash Partitioned Apriori (HPA). HPA partitions the candidate itemsets among processors using a hash function, much as a hash join does in relational databases. Because HPA effectively uses the combined memory of all processors, it works well for large-scale data mining in a parallel and distributed environment. The paper discusses dynamic data allocation as an optimization technique for executing this application in such an environment. Writing parallel data mining algorithms for a distributed environment is a non-trivial task. The main purpose of the proposed method is to address challenges associated with parallel and distributed data mining, namely (i) minimizing I/O, (ii) increasing processing speed, and (iii) reducing communication cost.
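The hash partitioning of candidate itemsets described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the function names (`owner`, `partition_candidates`, `hpa_count`), the CRC32 hash, and the use of in-memory lists to stand in for processors are all assumptions made for illustration, and the dynamic data allocation step is not modeled.

```python
from collections import Counter
from itertools import combinations
import zlib


def owner(itemset, num_procs):
    """Return the index of the (simulated) processor that owns an itemset.

    Assumption: CRC32 of the comma-joined, sorted items is the hash function;
    the abstract does not specify one.
    """
    key = ",".join(itemset).encode("utf-8")
    return zlib.crc32(key) % num_procs


def partition_candidates(candidates, num_procs):
    """Give each processor a counter table holding only the candidates hashed to it."""
    tables = [Counter() for _ in range(num_procs)]
    for itemset in candidates:
        canonical = tuple(sorted(itemset))
        tables[owner(canonical, num_procs)][canonical] = 0
    return tables


def hpa_count(partitions, candidates, k, min_support):
    """Simulate one counting pass over k-itemsets.

    partitions: one list of transactions per simulated processor.
    Each k-subset of a local transaction is routed by hash to its owner,
    which increments the count only if the subset is one of its candidates.
    """
    num_procs = len(partitions)
    tables = partition_candidates(candidates, num_procs)
    for local_transactions in partitions:
        for transaction in local_transactions:
            for itemset in combinations(sorted(set(transaction)), k):
                table = tables[owner(itemset, num_procs)]  # stands in for a network send
                if itemset in table:
                    table[itemset] += 1
    frequent = {
        itemset: count
        for table in tables
        for itemset, count in table.items()
        if count >= min_support
    }
    return frequent, tables


if __name__ == "__main__":
    partitions = [
        [["a", "b", "c"], ["a", "b"]],  # transactions local to processor 0
        [["a", "c"], ["a", "b", "d"]],  # transactions local to processor 1
    ]
    candidates = {("a", "b"), ("a", "c"), ("b", "c")}
    frequent, _ = hpa_count(partitions, candidates, k=2, min_support=2)
    print(frequent)
```

Because every candidate lives in exactly one table, the candidate set as a whole can be larger than any single processor's memory, which is the property the abstract attributes to HPA. In a real deployment the table lookup inside the loop would be a message to a remote node, so the hash function also determines the communication pattern.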