GTS: parallelization and vectorization of tight recurrences

  • Authors:
  • E. Ayguadé; J. Labarta; J. Torres; P. Borensztejn

  • Affiliations:
  • Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Pau Gargallo, 5, 08028-Barcelona SPAIN (all authors)

  • Venue:
  • Proceedings of the 1989 ACM/IEEE conference on Supercomputing
  • Year:
  • 1989

Abstract

In this paper we present a new method for extracting the maximum parallelism or vector operations from DO loops with tight recurrences written in sequential programming languages. We have named the method Graph Traverse Scheduling (GTS). It is devised to produce code for shared memory multiprocessors or vector machines. When parallelizing, hardware support for fast synchronization is assumed.

The method is presented for singly nested loops that include one or several recurrences, and we show how parallel and vector code is generated. Based on the dependence graph of a loop, we first evaluate its parallelism and the vector length of its statements. Then we apply GTS to distribute loop iterations between tasks or to generate vector operations of a given length. When this method is applied for parallel code generation, dependencies not included in the sequential execution of each task must be explicitly synchronized. A method to minimize the number of explicit synchronizations is also presented. We also show how to compute the synchronization-free parallelism, which yields fully independent tasks.

When GTS is applied for vector code generation, a sequential loop of vector operations is obtained.
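
The abstract gives no code, so the following is only a minimal sketch of the general idea, not the GTS algorithm itself. It assumes the simplest possible case: a single recurrence `a[i] = a[i - d] + b[i]` with a constant dependence distance `d`. The function names and the loop body are hypothetical, chosen for illustration. Under that assumption, iterations whose indices are congruent modulo `d` form `d` independent chains (fully independent tasks needing no synchronization), and the same loop can be rewritten as a sequential loop of vector operations of length `d`.

```python
def sequential(a, b, d):
    """Reference loop: a[i] = a[i - d] + b[i] for i = d .. n-1."""
    a = list(a)
    for i in range(d, len(a)):
        a[i] = a[i - d] + b[i]
    return a


def by_chains(a, b, d):
    """Distribute iterations into d chains; chain c runs i = d+c, 2d+c, ...

    Each chain only reads values it wrote itself (or initial values),
    so the d chains could run as separate tasks without synchronization.
    Here they are simply executed one after another.
    """
    a = list(a)
    for chain in range(d):
        for i in range(d + chain, len(a), d):
            a[i] = a[i - d] + b[i]
    return a


def by_vectors(a, b, d):
    """Sequential loop of vector operations of length at most d.

    Every block [start, stop) reads only elements before `start`,
    so the elements of one block can be computed together.
    """
    a = list(a)
    for start in range(d, len(a), d):
        stop = min(start + d, len(a))
        a[start:stop] = [a[i - d] + b[i] for i in range(start, stop)]
    return a


if __name__ == "__main__":
    a0 = [1, 2, 3, 0, 0, 0, 0, 0]
    b0 = list(range(8))
    print(sequential(a0, b0, 3))
    print(by_chains(a0, b0, 3))
    print(by_vectors(a0, b0, 3))
```

All three functions should produce the same result for any `d >= 1`. Loops with several interacting recurrences, which the paper addresses through the dependence graph, need explicit synchronization between tasks and are not covered by this sketch.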