Squeezing more CPU performance out of a Cray-2 by vector block scheduling

  • Authors:
  • C. Eisenbeis;W. Jalby;A. Lichnewsky

  • Affiliations:
  • I.N.R.I.A.;Domaine de Voluceau;78153 Le Chesnay CEDEX

  • Venue:
  • Proceedings of the 1988 ACM/IEEE conference on Supercomputing
  • Year:
  • 1988

Abstract

Compile-time scheduling of vector activities on the CRAY-2 is studied using a simplified model of the vector instruction stream. Owing to several hardware characteristics of the machine, an approach that draws on the authors' experience with array-processor microcode scheduling is shown to be practical. It calls for a pass of loop scheduling followed by a pass of resource allocation. Actual benchmarks of the resulting code are shown, exhibiting speed-ups as large as 50% over the current CFT77 compiler. Our results also give a new perspective on the comparison of chaining and non-chaining vector processor architectures.
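
The two-pass organisation mentioned in the abstract (scheduling first, resource allocation second) can be illustrated with a toy sketch. This is not the authors' algorithm: the `Op` record, the greedy in-order scheduler, the first-fit register assignment, the 64-cycle latencies and the unit names are all assumptions made for illustration only. The sketch models a non-chaining machine, so a consumer starts only after its producers have completed.

```python
from collections import namedtuple

# Hypothetical toy vector instruction: name, functional unit,
# names of producer ops, and latency in cycles (all illustrative).
Op = namedtuple("Op", "name unit deps latency")


def schedule(ops):
    """Pass 1: greedy scheduling of ops listed in dependency order.

    No chaining is modelled: an op starts only when all of its
    producers have finished and its functional unit is free.
    """
    start, finish, unit_free = {}, {}, {}
    for op in ops:
        ready = max((finish[d] for d in op.deps), default=0)
        t = max(ready, unit_free.get(op.unit, 0))
        start[op.name] = t
        finish[op.name] = t + op.latency
        unit_free[op.unit] = finish[op.name]
    return start, finish


def allocate(ops, start, finish, num_regs=8):
    """Pass 2: first-fit vector register assignment over the schedule.

    A result stays live until its last consumer has finished.
    num_regs=8 is an assumed register file size for this sketch.
    """
    last_use = {op.name: finish[op.name] for op in ops}
    for op in ops:
        for d in op.deps:
            last_use[d] = max(last_use[d], finish[op.name])
    reg_of = {}
    busy_until = [0] * num_regs
    for op in sorted(ops, key=lambda o: start[o.name]):
        for r in range(num_regs):
            if busy_until[r] <= start[op.name]:
                reg_of[op.name] = r
                busy_until[r] = last_use[op.name]
                break
        else:
            # A real compiler would spill or reschedule here.
            raise RuntimeError("out of vector registers")
    return reg_of


# Illustrative loop body: two loads, a multiply, and an add.
ops = [
    Op("load_a", "mem", (), 64),
    Op("load_b", "mem", (), 64),
    Op("mul", "fmul", ("load_a", "load_b"), 64),
    Op("add", "fadd", ("mul", "load_a"), 64),
]
start, finish = schedule(ops)
reg_of = allocate(ops, start, finish)
for op in ops:
    print(op.name, start[op.name], finish[op.name], "V%d" % reg_of[op.name])
```

Separating the passes keeps each one simple: the scheduler reasons only about time and functional units, and the allocator then checks whether the resulting value lifetimes fit in the register file.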