Optimal Replica Placement Strategy for Hierarchical Data Grid Systems

Authors:
Pangfeng Liu;Jan-Jan Wu
Affiliations:
National Taiwan University, Taiwan;Academia Sinica, Taiwan
Venue:
CCGRID '06 Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid
Year:
2006

Citing 0
Cited 5

Optimal replica placement in hierarchical Data Grids with locality assurance
Journal of Parallel and Distributed Computing

A model to predict the optimal performance of the Hierarchical Data Grid
Future Generation Computer Systems

Branch replication scheme: A new model for data replication in large scale data grids
Future Generation Computer Systems

Optimizing server placement in distributed systems in the presence of competition
Journal of Parallel and Distributed Computing

Quality of experience in distributed databases
Distributed and Parallel Databases

Quantified Score

Hi-index	0.00

Abstract

Grid computing is an important mechanism for utilizing distributed computing resources. These resources are spread across different geographical locations but are organized to provide an integrated service. To improve data access efficiency, Data Grid systems replicate essential data at multiple locations so that a user can access the data from a nearby site. This paper studies replica placement in Data Grid systems, taking two important issues into account. First, the replicas should be placed at server locations chosen so that the workload on each server is balanced. Second, the number of replicas should be chosen to balance data access efficiency against the expensive maintenance cost of multiple copies of the data. Clearly, optimizing the access cost of data requests and reducing the cost of replication are conflicting goals, and finding a good balance between them is a challenging task. We propose efficient algorithms for selecting optimal locations for the replicas so that the workload among these replicas is balanced. In addition, given the data usage from each user site and the maximum workload allowed for each replica server, our algorithm efficiently determines the minimum number of replicas required, as well as their locations.
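The last sentence of the abstract describes a concrete problem: given per-site data usage and a maximum workload per replica server, find a small set of replica locations in the hierarchy. The sketch below illustrates that problem statement only; it is not the algorithm from the paper. It assumes a routing model the abstract does not spell out: each request travels up the tree toward the root and is served by the first replica it meets, and the root always holds a copy. The function name `place_replicas`, the tree encoding, and the greedy rule (place a replica on the child subtree passing up the most unserved load) are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set


def place_replicas(children: Dict[str, List[str]],
                   load: Dict[str, int],
                   root: str,
                   capacity: int) -> Set[str]:
    """Greedy bottom-up replica placement (illustrative sketch only).

    children: maps each node to its list of child nodes.
    load:     requests generated at each node (missing nodes count as 0).
    capacity: assumed maximum workload a single replica may serve.
    """
    replicas: Set[str] = set()

    def unserved(node: str) -> int:
        # Load that this subtree still passes up to its parent.
        own = load.get(node, 0)
        if own > capacity:
            raise ValueError(f"load at {node} exceeds capacity; infeasible")
        passed_up = [(unserved(c), c) for c in children.get(node, [])]
        total = own + sum(w for w, _ in passed_up)
        # While the node would be overloaded, put a replica on the child
        # that passes up the most load, so that load is absorbed there.
        for w, child in sorted(passed_up, reverse=True):
            if total <= capacity:
                break
            replicas.add(child)
            total -= w
        return total

    unserved(root)
    replicas.add(root)  # assumption: the root always stores the data
    return replicas


# Hypothetical three-level hierarchy with requests at the leaves.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
usage = {"a1": 4, "a2": 5, "b1": 3}
print(sorted(place_replicas(tree, usage, "root", capacity=8)))
```

In this example node `a` would receive 9 units of load, which exceeds the assumed capacity of 8, so the sketch places a replica at `a2` and lets the remaining load flow to the root. Under these assumptions no replica ends up serving more than `capacity` requests.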