Conventional Benchmarks as a Sample of the Performance Spectrum

  • Authors:
  • John L. Gustafson; Rajat Todi

  • Affiliations:
  • Ames Laboratory, USDOE (gus@ameslab.gov); Ames Laboratory, USDOE (todi@scl.ameslab.gov)

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 1999

Abstract

Most benchmarks are smaller than actual application programs. One reason is to improve benchmark universality by demanding only resources that every computer is likely to have. However, users dynamically increase the size of application programs to match the power available, whereas most benchmarks are static and of a size appropriate for the computers available when the benchmark was created; this is particularly true for parallel computers. Thus, a fixed-size benchmark overstates computer performance, since smaller problems spend more time in cache. Scalable benchmarks, such as HINT, examine the full spectrum of performance through various memory regimes, and express a superset of the information given by any particular fixed-size benchmark. Using 5,000 experimental measurements, we have found that performance on the NAS Parallel Benchmarks, SPEC, LINPACK, and other benchmarks is predicted accurately by subsets of the HINT performance curve. Correlations are typically better than 0.995. Predicted ranking is often perfect.
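The idea of predicting a fixed-size benchmark from a subset of a scalable benchmark's curve can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say how the subsets are chosen or combined, so the sketch assumes a plain average over a memory-size window. The helper names (`window_mean`, `pearson`, `ranking`) are hypothetical, and any curve data passed in is supplied by the caller.

```python
import math


def window_mean(curve, lo, hi):
    """Average performance over curve points whose memory size lies in [lo, hi].

    `curve` is a list of (memory_size, performance) pairs, as a scalable
    benchmark might report. Averaging over a window is an assumption made
    for this sketch; the abstract does not specify the subset weighting.
    """
    values = [perf for size, perf in curve if lo <= size <= hi]
    if not values:
        raise ValueError("no curve points inside the window")
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def ranking(scores):
    """Machine names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


def compare(curves, measured, lo, hi):
    """Correlate window-based predictions with measured fixed-size scores.

    `curves` maps machine name -> scalable-benchmark curve;
    `measured` maps machine name -> fixed-size benchmark score.
    Returns (correlation, rankings_match).
    """
    names = sorted(curves)
    predicted = {name: window_mean(curves[name], lo, hi) for name in names}
    r = pearson([predicted[n] for n in names], [measured[n] for n in names])
    return r, ranking(predicted) == ranking(measured)
```

Under these assumptions, a small-memory window would stand in for a cache-resident benchmark and a large-memory window for one that runs out of main memory; the correlation and the ranking comparison correspond to the two quality measures quoted in the abstract.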