Efficient synchronization of replicated data in distributed systems

  • Authors:
  • Thorsten Schütt;Florian Schintke;Alexander Reinefeld

  • Affiliations:
  • Zuse Institute Berlin;Zuse Institute Berlin;Zuse Institute Berlin

  • Venue:
  • ICCS'03 Proceedings of the 1st international conference on Computational science: Part I
  • Year:
  • 2003

Abstract

We present nsync, a tool for synchronizing large replicated data sets in distributed systems. nsync computes nearly optimal synchronization plans based on a hierarchy of gossip algorithms that take the network topology into account. Our primary design goals were maximum performance and maximum scalability. We achieved these goals by exploiting parallelism in both the planning and the synchronization phases, by omitting the transfer of unnecessary metadata, by synchronizing at the block level rather than the file level, and by using sophisticated compression methods. With its relaxed consistency semantics, nsync needs neither a master copy nor a quorum for updating distributed replicas. Each replica is kept as an autonomous entity and can be modified with the usual tools.
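
The abstract does not describe how nsync implements block-level synchronization, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes fixed-size blocks compared by SHA-256 digest; the names `block_digests`, `blocks_to_transfer`, `apply_blocks`, and the tiny `BLOCK_SIZE` are hypothetical and chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size, kept tiny so the example is easy to follow


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split data into fixed-size blocks and return one SHA-256 digest per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def blocks_to_transfer(source: bytes, target: bytes,
                       block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the indices of source blocks that the target lacks or holds differently."""
    src = block_digests(source, block_size)
    dst = block_digests(target, block_size)
    return [i for i, d in enumerate(src) if i >= len(dst) or dst[i] != d]


def apply_blocks(source: bytes, target: bytes, indices: list[int],
                 block_size: int = BLOCK_SIZE) -> bytes:
    """Patch the target by copying only the listed blocks from the source."""
    # Start from the target, resized to the length of the source.
    patched = bytearray(target[:len(source)])
    if len(patched) < len(source):
        patched.extend(bytes(len(source) - len(patched)))
    for i in indices:
        start = i * block_size
        patched[start:start + block_size] = source[start:start + block_size]
    return bytes(patched)


if __name__ == "__main__":
    newer = b"aaaabbbbccccdd"
    older = b"aaaaXbbbcccc"
    changed = blocks_to_transfer(newer, older)
    print("blocks to send:", changed)
    print("in sync after patch:", apply_blocks(newer, older, changed) == newer)
```

In this sketch only the changed and missing blocks cross the network, whereas a file-level scheme would resend the whole file. A real tool would also have to handle insertions that shift block boundaries, which fixed-offset comparison does not.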