Replica selection strategies in data grid

  • Authors:
  • Rashedur M. Rahman;Reda Alhajj;Ken Barker

  • Affiliations:
  • Department of Computer Science, University of Calgary, Calgary, Alberta, Canada;Department of Computer Science, University of Calgary, Calgary, Alberta, Canada and Department of Computer Science, Global University, Beirut, Lebanon;Department of Computer Science, University of Calgary, Calgary, Alberta, Canada

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2008

Quantified Score

Hi-index 0.01

Visualization

Abstract

Replication in Data Grids reduces access latency and bandwidth consumption. When different sites hold replicas of datasets, there is a significant benefit realized by selecting the best replica. By selecting the best replica, the access latency can be minimized. In this research, we propose two different replica selection techniques. To select the best replica from information gathered locally, a simple technique called the k-Nearest Neighbor (KNN) rule is exploited. The KNN rule selects the best replica for a file by considering previous file transfer logs indicating the history of the file and those nearby. We also propose a predictive technique to estimate the transfer time between sites. The predicted transfer time can be used as an estimate of transfer bandwidth of different sites that hold replica currently, and help in selecting the best replica among different sites. Simulation results demonstrate that the k-nearest algorithm shows a significant performance improvement over the traditional replica catalog based model. Besides, the neural network predictive technique estimates the transfer time among sites more accurately than the multi-regression model.