Ridge: combining reliability and performance in open grid platforms

Authors:
Krishnaveni Budati;Jason Sonnek;Abhishek Chandra;Jon Weissman
Affiliations:
University of Minnesota;University of Minnesota;University of Minnesota;University of Minnesota
Venue:
Proceedings of the 16th international symposium on High performance distributed computing
Year:
2007

Abstract

Large-scale donation-based distributed infrastructures need to cope with the inherent unreliability of participant nodes. A widely used work scheduling technique in such environments is to redundantly schedule the outsourced computations to a number of nodes. We present the design and implementation of RIDGE, a reliability-aware system that uses a node's prior performance and behavior to make more effective scheduling decisions. We have implemented RIDGE on top of the BOINC distributed computing infrastructure and have evaluated its performance on a live testbed consisting of 120 PlanetLab nodes. Our experimental results show that RIDGE is able to match or surpass the throughput of the best vanilla BOINC configuration under different reliability environments by automatically adapting to the characteristics of the underlying environment. In addition, RIDGE provides much lower work unit makespans than BOINC, which indicates its desirability in service-oriented environments with time constraints.