Polycyclic Vector Scheduling vs. Chaining on 1-Port Vector Supercomputers

  • Authors:
  • J. H. Tang; E. S. Davidson; J. Tong

  • Affiliations:
  • Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI; Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI; The Center for Advanced Computer Studies, University of Southwestern Louisiana, Lafayette, Louisiana

  • Venue:
  • Proceedings of the 1988 ACM/IEEE conference on Supercomputing
  • Year:
  • 1988

Abstract

This paper studies the impact of chaining and of several instruction scheduling schemes on vector supercomputers with one memory port, as illustrated by the Cray-1 and the Cray-2. Because the Cray-2 vector processor lacks instruction chaining, it requires a different instruction scheduling scheme from that of the Cray-1. Situations are characterized in which simple vector scheduling can generate optimal code, that is, code that fully utilizes at least one functional unit, on machines with chaining. Given enough registers, polycyclic scheduling guarantees full utilization of one functional unit, after an initial transient, for loops with acyclic dependence graphs, even without chaining. Workloads are represented by the vectorizable Livermore Fortran Kernels (LFKs). The effectiveness of polycyclic scheduling on the Cray-2 is compared with that of optimal simple vector scheduling on the Cray-1. The speedup of polycyclic vector scheduling on the Cray-2 over the schedules produced by the current CFT77 compiler is also presented for several vectorizable LFKs.
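
The claim that polycyclic scheduling can keep one functional unit fully busy without chaining can be illustrated with a small Python sketch. It is a toy model under stated assumptions and not the authors' algorithm: the loop body (a vector multiply C = A * B), the unit names `mem` and `fmul`, the timing (every vector instruction occupies its unit for exactly one time slot, and a dependent instruction starts only after its producer has finished, i.e. no chaining), and the greedy placement rule are all hypothetical choices made for illustration.

```python
# Toy model of polycyclic (modulo) scheduling on a one-memory-port machine.
# Assumption: each vector instruction occupies its unit for one time slot,
# and a consumer may start only after its producer finishes (no chaining).

# Hypothetical loop body C = A * B: name -> (unit, predecessors).
# The dict is written in topological order, which the scheduler relies on.
OPS = {
    "load_a": ("mem", []),
    "load_b": ("mem", []),
    "mul": ("fmul", ["load_a", "load_b"]),
    "store": ("mem", ["mul"]),
}


def min_initiation_interval(ops):
    """Resource lower bound: the busiest unit sets the iteration spacing."""
    counts = {}
    for unit, _ in ops.values():
        counts[unit] = counts.get(unit, 0) + 1
    return max(counts.values())


def modulo_schedule(ops, ii):
    """Greedily place each op at the earliest slot whose (unit, slot mod ii)
    entry is free; a new loop iteration is started every ii slots."""
    start = {}
    reserved = set()  # (unit, slot mod ii) pairs already taken
    for name, (unit, preds) in ops.items():
        t = max((start[p] + 1 for p in preds), default=0)
        bumps = 0
        while (unit, t % ii) in reserved:
            t += 1
            bumps += 1
            if bumps >= ii:
                raise ValueError(f"no free slot for {name} with ii={ii}")
        reserved.add((unit, t % ii))
        start[name] = t
    return start


def port_utilization(ops, unit, ii):
    """Steady-state fraction of slots in which the given unit is busy."""
    used = sum(1 for u, _ in ops.values() if u == unit)
    return used / ii


if __name__ == "__main__":
    # Baseline: iterations do not overlap, so one starts every 4 slots.
    print("non-overlapped:", modulo_schedule(OPS, 4),
          port_utilization(OPS, "mem", 4))
    # Polycyclic: iterations overlap and start every 3 slots.
    ii = min_initiation_interval(OPS)
    print("polycyclic:    ", modulo_schedule(OPS, ii),
          port_utilization(OPS, "mem", ii))
```

In this model the overlapped schedule delays the store so that it does not collide with the loads of later iterations. The product therefore has to be held in a vector register across more than one initiation interval, which is one way to read the abstract's "enough registers" condition.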