Trust but verify: monitoring remotely executing programs for progress and correctness

  • Authors:
  • Shuo Yang;Ali R. Butt;Y. Charlie Hu;Samuel P. Midkiff

  • Affiliations:
  • Purdue University, West Lafayette, IN;Purdue University, West Lafayette, IN;Purdue University, West Lafayette, IN;Purdue University, West Lafayette, IN

  • Venue:
  • Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

The increased popularity of grid systems and cycle sharing across organizations requires scalable systems that provide facilities to locate resources, to be fair in the use of those resources, and to monitor jobs executing on remote systems. This paper describes the GridCop system which allows a computation on a remote, and potentially fraudulent, host system to be monitored for progress and execution correctness. A novel feature of our system is that it constructs cooperating submitter and host programs from the original program, and these programs allow both progress and execution correctness to be monitored with negligible overhead while providing protection against common fraudulent behaviors. Experimental results show that the overhead of this monitoring is low on both the submitting and host machines. We describe compiler algorithms that allow the required monitoring code to be automatically generated.