Performance-based data distribution for data mining applications on grid computing environments

  • Authors:
  • Wen-Chung Shih; Chao-Tung Yang; Shian-Shyong Tseng

  • Affiliations:
  • Department of Information Science and Applications, Asia University, Taichung, Taiwan, ROC 41354; High-Performance Computing Laboratory, Department of Computer Science, Tunghai University, Taichung, Taiwan, ROC 40704; Department of Information Science and Applications, Asia University, Taichung, Taiwan, ROC 41354 and Department of Computer and Information Science, National Chiao Tung University, Hsinchu, Taiwan ...

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2010

Abstract

Effective data distribution techniques can significantly reduce the total execution time of a program in grid computing environments, especially for data mining applications. In this paper, we describe a linear programming formulation of the data distribution problem on grids. We also propose a heuristic method, the Heuristic Data Distribution Scheme (HDDS), to solve this problem. We implement two types of data mining applications, Association Rule Mining and Decision Tree Construction, and conduct experiments on grid testbeds. Experimental results show that data mining programs that use the proposed HDDS to distribute data execute more efficiently than those that use traditional distribution schemes.
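
The abstract does not spell out the linear programming formulation or the steps of HDDS, so the following is only a minimal sketch of the general idea behind performance-based distribution: give each node a share of the data proportional to its measured performance. The function name `distribute`, the `speeds` parameter, and the largest-remainder rounding rule are assumptions made for illustration; they are not taken from the paper.

```python
from fractions import Fraction
import math


def distribute(total_items, speeds):
    """Split total_items across nodes in proportion to their relative speeds.

    Hypothetical helper, NOT the paper's HDDS: it only illustrates
    performance-proportional partitioning with largest-remainder rounding.
    """
    if total_items < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("need a non-negative item count and positive speeds")

    # Exact rational arithmetic avoids floating-point rounding surprises.
    weights = [Fraction(s) for s in speeds]
    total_weight = sum(weights)
    exact = [total_items * w / total_weight for w in weights]

    # Start from the integer part of each node's ideal share.
    shares = [math.floor(x) for x in exact]
    leftover = total_items - sum(shares)

    # Hand the remaining items to the nodes with the largest fractional parts.
    by_remainder = sorted(
        range(len(speeds)), key=lambda i: exact[i] - shares[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        shares[i] += 1
    return shares


if __name__ == "__main__":
    # Three nodes; the second and third are assumed to be twice as fast.
    print(distribute(1000, [1.0, 2.0, 2.0]))
```

A real scheme for grids would likely also account for factors such as network bandwidth between nodes, which this sketch deliberately ignores.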