Generalised Repair for Overlay Networks

  • Authors:
  • Barry Porter; Francois Taiani; Geoff Coulson

  • Affiliations:
  • Lancaster University, Lancaster, UK; Lancaster University, Lancaster, UK; Lancaster University, Lancaster, UK

  • Venue:
  • SRDS '06 Proceedings of the 25th IEEE Symposium on Reliable Distributed Systems
  • Year:
  • 2006

Abstract

We present and evaluate a generic approach to the repair of overlay networks that identifies general principles of overlay repair and embodies them as a reusable service. At the heart of our approach is an algorithm that discovers the extent of a failed section of any type of overlay and assigns responsibility for carrying out the repair. The repair strategy itself is 'pluggable' and can be tailored to the requirements of a specific overlay type or instance. Our approach is efficient in terms of the number of repair-related message exchanges it incurs; scalable in that it involves only nodes in the locality of the failed section of the overlay; and resilient in that it correctly handles cases in which multiple adjacent nodes fail simultaneously and tolerates new failures that occur while a repair is underway. The benefits of our approach are that: (i) it extracts and encapsulates best practice in repair for overlays; (ii) it simplifies the design and implementation of new overlays (because repair issues can be treated orthogonally to basic functionality); and (iii) it supports tailorable levels of dependability for overlays, including pluggable repair strategies.
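
The three ideas in the abstract (discovering the extent of a failed section, assigning responsibility for its repair, and plugging in a repair strategy) can be illustrated with a small sketch. It is not the paper's algorithm: it is a centralised, in-memory simulation rather than a distributed protocol, and every name in it (`failed_section`, `chain_border`, `repair`), the "smallest border identifier coordinates" rule, and the chain-linking strategy are assumptions made purely for illustration.

```python
from collections import deque
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical model: an undirected overlay as node id -> set of neighbour ids.
Overlay = Dict[str, Set[str]]
# A pluggable strategy receives the overlay, the failed section and its live border.
RepairStrategy = Callable[[Overlay, Set[str], Set[str]], None]


def failed_section(overlay: Overlay, failed: Set[str], start: str) -> Tuple[Set[str], Set[str]]:
    """Breadth-first walk over adjacent failed nodes, beginning at `start`.

    Returns the connected failed section and the live nodes bordering it.
    """
    section: Set[str] = set()
    border: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in section:
            continue
        section.add(node)
        for neighbour in overlay[node]:
            if neighbour in failed:
                if neighbour not in section:
                    queue.append(neighbour)
            else:
                border.add(neighbour)  # live node adjacent to the failure
    return section, border


def chain_border(overlay: Overlay, section: Set[str], border: Set[str]) -> None:
    """Example strategy (an assumption): drop failed nodes, then link the
    surviving border nodes in a chain ordered by identifier."""
    for node in section:
        for neighbour in overlay.pop(node):
            if neighbour in overlay:
                overlay[neighbour].discard(node)
    ordered = sorted(border)
    for a, b in zip(ordered, ordered[1:]):
        overlay[a].add(b)
        overlay[b].add(a)


def repair(overlay: Overlay, failed: Set[str], strategy: RepairStrategy) -> List[str]:
    """Repair each failed section once; return the chosen coordinator per section."""
    coordinators: List[str] = []
    remaining = set(failed)
    while remaining:
        section, border = failed_section(overlay, failed, min(remaining))
        remaining -= section
        if border:
            # Assumed responsibility rule: smallest identifier on the border.
            coordinators.append(min(border))
            strategy(overlay, section, border)
    return coordinators
```

For a six-node ring `a` to `f` in which the adjacent nodes `c` and `d` fail together, `repair(overlay, {"c", "d"}, chain_border)` treats them as one section with border `{b, e}`, makes `b` responsible, and links `b` to `e`. Only the border nodes are touched, which mirrors the locality property the abstract claims; a different overlay type would pass a different strategy function in place of `chain_border`.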