Ridge: combining reliability and performance in open grid platforms

  • Authors:
  • Krishnaveni Budati;Jason Sonnek;Abhishek Chandra;Jon Weissman

  • Affiliations:
  • University of Minnesota;University of Minnesota;University of Minnesota;University of Minnesota

  • Venue:
  • Proceedings of the 16th international symposium on High performance distributed computing
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Large-scale donation-based distributed infrastructures need to cope with the inherent unreliability of participant nodes. A widely-used work scheduling technique in such environments is to redundantly schedule the out sourced computations to a number of nodes. We present the design and implementation of RIDGE, a reliability aware system which uses a node's prior performance and behavior to make more effective scheduling decisions. We have implemented RIDGE on top of the BOINC distributed computing infrastructure and have evaluated its performance on a live test bed consisting of 120 PlanetLab nodes. Our experimental results show that RIDGE is able to match or surpass the throughput of the best vanilla BOINC configuration under different reliability environments, by automatically adapting to the characteristics of the underlying environment. In addition, RIDGE is able to provide much lower work unit makes pans compared to BOINC, which indicates its desirability in service-oriented environments with time constraints.