Integration of scheduling and replication in data grids

  • Authors:
  • Anirban Chakrabarti;R. A. Dheepak;Shubhashis Sengupta

  • Affiliations:
  • Software Engineering and Technology Laboratory, Infosys Technologies Ltd, Bangalore, India;Software Engineering and Technology Laboratory, Infosys Technologies Ltd, Bangalore, India;Software Engineering and Technology Laboratory, Infosys Technologies Ltd, Bangalore, India

  • Venue:
  • HiPC'04 Proceedings of the 11th international conference on High Performance Computing
  • Year:
  • 2004

Abstract

Data Grids seek to harness geographically distributed resources for large-scale data-intensive problems. Such problems involve loosely coupled jobs and large data sets distributed remotely. Data Grids have found applications in scientific research fields such as high-energy physics and the life sciences, as well as in enterprises. The issues that need to be considered in Data Grid research include resource management for computation and data. Computation management comprises scheduling of jobs, scalability, and response time, while data management includes replication and movement of data at selected sites. As jobs are data intensive, data management issues often become integral to the problems of scheduling and effective resource management in Data Grids. This paper deals with the problem of integrating the scheduling and replication strategies. As part of the solution, we propose an Integrated Replication and Scheduling Strategy (IRS), which aims at an iterative improvement of performance based on the coupling between the scheduling and replication strategies. Results suggest that, in the context of our experiments, IRS performs better than several well-known replication strategies.