Data Sharing Pattern Aware Scheduling on Grids

  • Authors: Young Choon Lee; Albert Y. Zomaya
  • Affiliations: The University of Sydney, Australia (both authors)
  • Venue: ICPP '06 Proceedings of the 2006 International Conference on Parallel Processing
  • Year: 2006

Abstract

An increasing number of applications, especially in science and engineering, deal with massive amounts of data and are therefore data-intensive. Bioinformatics, data mining and image processing are typical areas of data-intensive applications. Such applications tend to be deployed on grids, which provide powerful processing capabilities at reasonable cost. One fundamental scheduling issue that arises when running these applications on grids is the minimization of data transfer. An efficient scheduling scheme that takes data transfers into account is therefore essential for achieving both a shorter application completion time and efficient system utilization. In this paper, a novel scheduling algorithm, called the Shared Input data based Listing (SIL) algorithm, is proposed for data-intensive bag-of-tasks (DBoT) applications in grid environments. The algorithm uses a set of task lists that are constructed taking the data sharing pattern into account and that are reorganized dynamically, based on the performance of resources, during the execution of the application. The primary goal of this dynamic listing is to minimize data transfer and thereby shorten the overall completion time of DBoT applications. SIL further attempts to reduce serious increases in schedule length by adopting task duplication. In our evaluation study, extensive simulation tests with three different types of the DBoT application model were conducted. In the experimental results, SIL noticeably outperforms two previously proposed algorithms in schedule length.
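
The abstract does not give the details of SIL, so the following is only a minimal sketch of the general idea of building task lists from a data sharing pattern. It is not the authors' algorithm. The function names, the input format (a mapping from task id to a set of input file names) and the greedy placement rule are all assumptions made for illustration, and the dynamic reorganization and task duplication steps are not modeled.

```python
def build_task_lists(task_inputs, num_lists):
    """Greedily group tasks so that tasks sharing input files land in the same list.

    task_inputs: dict mapping task id -> set of input file names (hypothetical format).
    Returns a list of (task_ids, files) pairs, one pair per task list.
    """
    lists = [([], set()) for _ in range(num_lists)]
    # Visit tasks with the most inputs first; ties are broken by task id.
    for task in sorted(task_inputs, key=lambda t: (-len(task_inputs[t]), t)):
        files = task_inputs[task]
        # Pick the list that already holds the most of this task's files.
        # Ties go to the shorter list, then to the lower list index.
        best = max(
            range(num_lists),
            key=lambda i: (len(files & lists[i][1]), -len(lists[i][0]), -i),
        )
        lists[best][0].append(task)
        lists[best][1].update(files)
    return lists


def files_to_transfer(lists):
    """Count file transfers, assuming each list fetches each distinct file once."""
    return sum(len(files) for _, files in lists)


if __name__ == "__main__":
    # Illustrative data only: two pairs of tasks, each pair sharing its inputs.
    inputs = {
        "t1": {"a", "b"},
        "t2": {"a", "b"},
        "t3": {"c", "d"},
        "t4": {"c", "d"},
    }
    grouped = build_task_lists(inputs, 2)
    for tasks, files in grouped:
        print(tasks, sorted(files))
    print("transfers:", files_to_transfer(grouped))
```

With this toy input, the tasks that share files end up in the same list, so each file is fetched once. A round-robin split of the same tasks would put both file pairs in both lists and double the number of transfers. The sketch balances load only through tie-breaking, which is one reason a real scheduler would need the resource-aware reorganization the abstract mentions.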