Branch replication scheme: A new model for data replication in large scale data grids

  • Authors:
  • José M. Pérez; Félix García-Carballeira; Jesús Carretero; Alejandro Calderón; Javier Fernández

  • Affiliations:
  • Computer Architecture Group, Computer Science Department, Universidad Carlos III de Madrid, Leganes, Madrid, Spain (all authors)

  • Venue:
  • Future Generation Computer Systems
  • Year:
  • 2010

Abstract

Data replication is a practical and effective method to achieve efficient and fault-tolerant data access in grids. Traditionally, data replication schemes maintain an entire replica at each site where a file is replicated and provide a read-only model. These solutions require huge storage resources to hold the whole set of replicas and, to avoid consistency problems, do not allow efficient data modification. In this paper we propose a new replication method, called the Branch Replication Scheme (BRS), that provides three main advantages over traditional approaches: optimizing storage usage, by creating subreplicas; increasing data access performance, by applying parallel I/O techniques; and making it possible to modify replicas, by maintaining consistency among updates efficiently. The paper formally describes an analytical model of the replication scheme, the naming system, and the replica updating scheme. Using this model, operations such as reading, writing, or updating a replica are analyzed. Simulation results demonstrate the feasibility of BRS: the new replication algorithm increases data access performance compared with popular replication schemes, such as hierarchical and server-directed replication, that are commonly used in current data grids.
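
The first two advantages named in the abstract, subreplicas and parallel I/O, can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that a subreplica is a contiguous, disjoint byte range of the file held at a different site, and it simulates the sites with in-memory chunks. The names `split_into_subreplicas` and `parallel_read` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subreplicas(data: bytes, n_sites: int) -> list[tuple[int, bytes]]:
    """Split data into at most n_sites contiguous, disjoint (offset, chunk) pieces.

    Assumption for illustration: each piece would live at a different site.
    """
    if not data or n_sites < 1:
        return []
    size = -(-len(data) // n_sites)  # ceiling division
    return [(off, data[off:off + size]) for off in range(0, len(data), size)]


def parallel_read(subreplicas: list[tuple[int, bytes]]) -> bytes:
    """Fetch every subreplica concurrently and reassemble them in offset order."""
    if not subreplicas:
        return b""
    ordered = sorted(subreplicas, key=lambda piece: piece[0])
    with ThreadPoolExecutor(max_workers=len(ordered)) as pool:
        # The lambda stands in for a remote read from one site;
        # pool.map returns results in input order.
        chunks = list(pool.map(lambda piece: piece[1], ordered))
    return b"".join(chunks)


if __name__ == "__main__":
    data = bytes(range(256)) * 4
    pieces = split_into_subreplicas(data, 3)
    # Bytes stored by the subreplicas together vs. three entire replicas.
    print(sum(len(chunk) for _, chunk in pieces), 3 * len(data))
    assert parallel_read(pieces) == data
```

Under these assumptions, the subreplicas together occupy one file's worth of storage, whereas keeping an entire replica at each of the three sites would occupy three, and a reader can fetch the pieces concurrently instead of streaming the whole file from one site.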