Replicated data management in the grid: the Re:GRIDiT approach

  • Authors:
  • Laura Cristiana Voicu;Heiko Schuldt;Yuri Breitbart;Hans-Jörg Schek

  • Affiliations:
  • University of Basel, Basel, Switzerland;University of Basel, Basel, Switzerland;Kent State University, Kent, OH, USA;ETH Zurich, Zurich, Switzerland

  • Venue:
  • Proceedings of the 1st ACM workshop on Data grids for eScience
  • Year:
  • 2009

Abstract

Grid environments increasingly target novel domains such as eScience, eHealth, and digital libraries, which feature a variety of data-intensive applications. Consequently, data management in Grids is gaining importance. The Grid allows keeping a large number of replicas of data objects, possibly with different versions or levels of freshness, to provide the high degree of availability, reliability, and performance that users and applications need. At the same time, replication management must be integrated seamlessly into the Grid, taking its special characteristics into account, without any central component for managing data or metadata. In this paper, we report on the ongoing Re:GRIDiT project, which aims to address all of the above requirements. Re:GRIDiT distinguishes between potentially many updateable and read-only replicas, which can be distributed across a Grid environment. First, Re:GRIDiT provides new protocols for the correct synchronization of concurrent updates to different updateable replicas and for their subsequent propagation in a completely distributed way. Second, Re:GRIDiT takes into account the semantics of the data managed in the Grid: mutable data can be subject to updates; immutable data, in turn, cannot be changed once created but may be subject to version control. Third, Re:GRIDiT will be dynamic: depending on the current load, replicas (updateable or read-only) can be created or removed on demand. Fourth, Re:GRIDiT will give read-only transactions full flexibility to specify the required freshness (for mutable data) or version number (for immutable data), which is particularly useful for trading accuracy for performance when accessing data in the Grid.
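
To make the fourth point concrete, the following is a minimal sketch of how a read-only transaction might trade accuracy for performance by stating a freshness requirement. It is not the Re:GRIDiT protocol, which the abstract does not specify. The `Replica` fields, the freshness scale from 0.0 to 1.0, the load scale, and the `pick_replica` function are all hypothetical names and assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    """Hypothetical description of one replica of a mutable data object."""
    site: str
    updateable: bool   # updateable replica vs. read-only replica
    freshness: float   # assumed scale: 1.0 = fully up to date, 0.0 = arbitrarily stale
    load: float        # assumed scale: 0.0 = idle, 1.0 = saturated


def pick_replica(replicas: Sequence[Replica], min_freshness: float) -> Optional[Replica]:
    """Return the least-loaded replica that is at least as fresh as requested.

    Returns None when no replica satisfies the freshness requirement.
    """
    candidates = [r for r in replicas if r.freshness >= min_freshness]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.load)


# Illustrative data only; the sites and numbers are made up.
replicas = [
    Replica("site-a", updateable=True, freshness=1.0, load=0.9),
    Replica("site-b", updateable=False, freshness=0.8, load=0.2),
    Replica("site-c", updateable=False, freshness=0.5, load=0.1),
]

relaxed = pick_replica(replicas, min_freshness=0.7)   # tolerates some staleness
strict = pick_replica(replicas, min_freshness=1.0)    # demands the latest state
```

In this sketch, a transaction that tolerates slightly stale data is served by a lightly loaded read-only replica, while a transaction that demands fully fresh data has to use the busier updateable replica. For immutable data, the same idea would presumably apply with a requested version number in place of the freshness threshold.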