A Super-Programming Approach for Mining Association Rules in Parallel on PC Clusters

  • Authors:
  • Dejiang Jin;Sotirios G. Ziavras

  • Affiliations:
  • -;IEEE

  • Venue:
  • IEEE Transactions on Parallel and Distributed Systems
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

PC clusters have become popular in parallel processing. They do not involve specialized interprocessor networks, so the latency of data communications is rather long. The programming models for PC clusters are often different than those for parallel machines or supercomputers containing sophisticated interprocessor communication networks. For PC clusters, load balancing among the nodes becomes a more critical issue in attempts to yield high performance. We introduce a new model for program development on PC clusters, namely, the Super-Programming Model (SPM). The workload is modeled as a collection of Super-Instructions (SIs). We propose that a set of SIs be designed for each application domain. They should constitute an orthogonal set of frequently used high-level operations in the corresponding application domain. Each SI should normally be implemented as a high-level language routine that can execute on any PC. Application programs are modeled as Super-Programs (SPs), which are coded using SIs. SIs are dynamically assigned to available PCs at runtime. Because of the known granularity of SIs, an upper bound on their execution time can be estimated at static time. Therefore, dynamic load balancing becomes an easier task. Our motivation is to support dynamic load balancing and code porting, especially for applications with diverse sets of inputs such as data mining. We apply here SPM to the implementation of an Apriori-like algorithm for mining association rules. Our experiments show that the average idle time per node is kept very low.