Measuring the Performance and Reliability of Production Computational Grids

  • Authors:
  • Omid Khalili, Jiahua He, Catherine Olschanowsky, Allan Snavely, Henri Casanova

  • Affiliations:
  • Dept. of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA 92093-0505, USA. okhalili@cs.ucsd.edu
  • Dept. of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA 92093-0505, USA. j2he@cs.ucsd.edu
  • San Diego Supercomputer Center, 9500 Gilman Dr, La Jolla, CA 92093-0505, USA. cmills@sdsc.edu
  • San Diego Supercomputer Center, 9500 Gilman Dr, La Jolla, CA 92093-0505, USA. allans@sdsc.edu
  • Dept. of Information and Computer Sciences, University of Hawai'i at Manoa, 1680 East-West Rd, Honolulu, HI 96822, USA. henric@hawaii.edu

  • Venue:
  • GRID '06: Proceedings of the 7th IEEE/ACM International Conference on Grid Computing
  • Year:
  • 2006

Abstract

In this work we report on data gathered by deploying a monitoring and benchmarking infrastructure on two production grid platforms, TeraGrid and GEON. Our results show that these production grids exhibit significant unavailability, with success rates for benchmark and application runs between 55% and 80%. We also found performance fluctuations in the 50% range, mostly due, as expected, to batch schedulers. We further investigate whether the execution time of a typical grid application can be predicted based on previous runs of simple benchmarks. Perhaps surprisingly, we find that application execution time can be predicted with a relative error as low as 9%.
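
The abstract does not specify the prediction method. As a minimal sketch of the general idea, one way to predict application execution time from prior runs of simple benchmarks is an ordinary least-squares linear model over the benchmark timings; the example below uses synthetic data, and the model form, benchmark count, and all numbers are hypothetical illustrations, not taken from the paper.

    # Hypothetical sketch: predict application execution time from
    # prior runs of simple benchmarks. The least-squares linear model
    # is an assumption for illustration; the paper's actual predictor
    # may differ. All data here is synthetic.
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic history: each row holds the timings of 3 simple
    # benchmarks observed alongside one application run (seconds).
    bench_times = rng.uniform(1.0, 10.0, size=(20, 3))
    true_weights = np.array([4.0, 2.5, 1.0])
    app_times = bench_times @ true_weights + rng.normal(0.0, 2.0, 20)

    # Fit app_time ~ w . bench_times + b by ordinary least squares.
    X = np.hstack([bench_times, np.ones((len(bench_times), 1))])
    coef, *_ = np.linalg.lstsq(X, app_times, rcond=None)

    # Predict a new application run from fresh benchmark timings and
    # report the relative error, the metric the abstract quotes.
    new_bench = rng.uniform(1.0, 10.0, size=3)
    predicted = np.append(new_bench, 1.0) @ coef
    actual = new_bench @ true_weights  # synthetic ground truth

    rel_error = abs(predicted - actual) / actual
    print(f"predicted={predicted:.1f}s actual={actual:.1f}s "
          f"relative error={rel_error:.1%}")

In practice the relative error would be computed over held-out application runs rather than a single synthetic point, but the structure is the same: benchmark timings in, a runtime estimate out, compared against the observed runtime.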