Enhanced Dynamic Hierarchical Replication and Weighted Scheduling Strategy in Data Grid

  • Authors:
  • Najme Mansouri;Gholam Hosein Dastghaibyfard

  • Affiliations:
  • Department of Computer Science & Engineering, Birjand University of Technology, Postal Code 97175-569, SanatoMadan Avenue, Birjand, Iran;Department of Computer Science & Engineering, College of Electrical & Computer Engineering, Shiraz University, MollaSadra Avenue, Shiraz, Iran

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

The Data Grid provides massive aggregated computing resources and distributed storage space to deal with data-intensive applications. Due to the limitation of available resources in the grid as well as production of large volumes of data, efficient use of the Grid resources becomes an important challenge. Data replication is a key optimization technique for reducing access latency and managing large data by storing data in a wise manner. Effective scheduling in the Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. In this paper two strategies are proposed, first a novel job scheduling strategy called Weighted Scheduling Strategy (WSS) that uses hierarchical scheduling to reduce the search time for an appropriate computing node. It considers the number of jobs waiting in a queue, the location of the required data for the job and the computing capacity of the sites Second, a dynamic data replication strategy, called Enhanced Dynamic Hierarchical Replication (EDHR) that improves file access time. This strategy is an enhanced version of the Dynamic Hierarchical Replication strategy. It uses an economic model for file deletion when there is not enough space for the replica. The economic model is based on the future value of a data file. Best replica placement plays an important role for obtaining maximum benefit from replication as well as reducing storage cost and mean job execution time. So, it is considered in this paper. The proposed strategies are implemented by OptorSim, the European Data Grid simulator. Experiment results show that the proposed strategies achieve better performance by minimizing the data access time and avoiding unnecessary replication.