Sabotage-Tolerance Mechanisms for Volunteer Computing Systems
CCGRID '01 Proceedings of the 1st International Symposium on Cluster Computing and the Grid
XtremWeb: A Generic Global Computing System
CCGRID '01 Proceedings of the 1st International Symposium on Cluster Computing and the Grid
Entropia: architecture and performance of an enterprise desktop grid system
Journal of Parallel and Distributed Computing - Special issue on computational grids
BOINC: A System for Public-Resource Computing and Storage
GRID '04 Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 1 - Volume 02
The Computational and Storage Potential of Volunteer Computing
CCGRID '06 Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid
BitDew: a programmable environment for large-scale data management and distribution
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
GPC '09 Proceedings of the 4th International Conference on Advances in Grid and Pervasive Computing
Collusion Detection for Grid Computing
CCGRID '09 Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid
GridBot: execution of bags of tasks in multiple grids
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Generalized Spot-Checking for Sabotage-Tolerance in Volunteer Computing Systems
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
Modeling and tolerating heterogeneous failures in large parallel systems
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
DISC'11 Proceedings of the 25th international conference on Distributed computing
Hi-index | 0.00 |
Desktop grids use the free resources in Intranet and Internet environments for large-scale computation and storage. While desktop grids offer a high return on investment, one critical issue is the validation of results returned by participating hosts. Several mechanisms for result validation have been previously proposed. However, the characterization of errors is poorly understood. To study error rates, we implemented and deployed a desktop grid application across several thousand hosts distributed over the Internet. We then analyzed the results to give quantitative and empirical characterization of errors stemming from input or output (I/O) failures. We find that in practice, error rates are widespread across hosts but occur relatively infrequently. Moreover, we find that error rates tend to not be stationary over time nor correlated between hosts. In light of these characterization results, we evaluated state-of-the-art error detection mechanisms and describe the trade-offs for using each mechanism.