A Robust and Efficient Message Passing Library for Volunteer Computing Environments

  • Authors:
  • Rakhi Anand;Troy Leblanc;Edgar Gabriel;Jaspal Subhlok

  • Affiliations:
  • Department of Computer Science, University of Houston, Houston, USA 77204 (all authors)

  • Venue:
  • Journal of Grid Computing
  • Year:
  • 2011

Abstract

The objective of this research is to convert ordinary idle PCs into virtual clusters for executing parallel applications. The paper presents VolpexMPI, a message passing library designed to enable seamless forward application progress in the presence of frequent node failures as well as dynamically changing networks and node execution speeds. Process replication is employed to provide robustness. The central challenge in the design of VolpexMPI is to efficiently and automatically manage a dynamically varying number of process replicas in different states of execution progress. The key fault tolerance technique employed is fully distributed sender-based logging. The paper presents the design and an implementation of VolpexMPI. Preliminary results validate that the overhead of providing robustness is modest for applications with a favorable ratio of communication to computation and a low degree of communication.
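
The general idea of sender-based logging combined with process replication can be illustrated with a minimal sketch. This is not the VolpexMPI API or its implementation: the class names `SenderLog` and `ReceiverReplica`, the per-channel sequence numbers, and the pull-style `fetch` call are all assumptions introduced here for illustration. The sketch assumes the sender keeps every message it sends in a local log, and that each replica of a receiving process requests messages by sequence number at its own pace, so replicas in different states of execution progress can be served from the same log.

```python
from collections import defaultdict


class SenderLog:
    """Hypothetical sender-side message log (illustrative, not the VolpexMPI API)."""

    def __init__(self):
        # (dest_rank, tag) -> list of payloads, indexed by send sequence number
        self._log = defaultdict(list)

    def send(self, dest_rank, tag, payload):
        """Record the message in the local log and return its sequence number."""
        channel = self._log[(dest_rank, tag)]
        channel.append(payload)
        return len(channel) - 1

    def fetch(self, dest_rank, tag, seq):
        """Serve a request for message number `seq`; return None if not sent yet."""
        channel = self._log[(dest_rank, tag)]
        return channel[seq] if seq < len(channel) else None


class ReceiverReplica:
    """One replica of a logical receiving rank; it tracks its own progress."""

    def __init__(self, rank):
        self.rank = rank
        # tag -> next sequence number this replica will request
        self._next_seq = defaultdict(int)

    def recv(self, sender_log, tag):
        """Request the next message on this channel; advance only on success."""
        seq = self._next_seq[tag]
        payload = sender_log.fetch(self.rank, tag, seq)
        if payload is not None:
            self._next_seq[tag] += 1
        return payload


# Usage: two replicas of rank 1 consume the same logged messages independently.
log = SenderLog()
log.send(1, 0, "step-0")
log.send(1, 0, "step-1")

fast = ReceiverReplica(1)
slow = ReceiverReplica(1)
fast_msgs = [fast.recv(log, 0), fast.recv(log, 0)]  # fast replica reads both
slow_msg = slow.recv(log, 0)                        # slow replica is one step behind
```

In this sketch a lagging or restarted replica never forces the sender to roll back, because older messages remain available in the log. A real implementation would also need to bound the log size and to distinguish a legitimately empty payload from "not sent yet", both of which are ignored here.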