On Parallel MIMD Computation: HEP Supercomputer and Its Applications
Automatic translation of FORTRAN programs to vector form
ACM Transactions on Programming Languages and Systems (TOPLAS)
Synchronizing processors through memory requests in a tightly coupled multiprocessor
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer Architecture
Optimizing Supercompilers for Supercomputers
Parallel Programming and Compilers
Dependence graphs and compiler optimizations
POPL '81 Proceedings of the 8th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
GTS: Extracting Full Parallelism Out of DO Loops
PARLE '89 Proceedings of the Parallel Architectures and Languages Europe, Volume II: Parallel Languages
Performance Analysis of Parallelizing Compilers on the Perfect Benchmarks Programs
IEEE Transactions on Parallel and Distributed Systems
Effects of Loop Fusion and Statement Migration on the Speedup of Vector Multiprocessors
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
In this paper we present a new method for extracting the maximum parallelism or vector operations out of DO loops with tight recurrences written in sequential programming languages. We have named the method Graph Traverse Scheduling (GTS). It is devised to produce code for shared-memory multiprocessors or vector machines. When parallelizing, hardware support for fast synchronization is assumed.

The method is presented for singly nested loops containing one or several recurrences, and we show how both parallel and vector code are generated. Based on the dependence graph of a loop, we first evaluate its parallelism and the vector length of its statements. We then apply GTS either to distribute loop iterations among tasks or to generate vector operations of a given length. When the method is applied for parallel code generation, dependences not enforced by the sequential execution of each task must be explicitly synchronized; a method to minimize the number of explicit synchronizations is also presented. We also show how to compute the synchronization-free parallelism, obtaining fully independent tasks. When GTS is applied for vector code generation, a sequential loop of vector operations is obtained.
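The synchronization-free case described in the abstract can be illustrated with a small sketch (the function names and the single-recurrence example loop are ours, not taken from the paper). For a loop whose only dependence is a recurrence of constant distance d, such as a[i] = a[i-d] + b[i], the dependence graph is a single self-cycle of distance d, so the iterations split into d chains by residue modulo d; iterations within a chain must run in order, but the chains carry no dependence between them and can run as fully independent tasks:

```python
def chain_partition(n, d):
    """Group iterations i = d .. n-1 of the loop by i mod d.

    For the recurrence a[i] = a[i-d] + b[i], each group is a chain whose
    iterations must execute in source order, but distinct chains share no
    dependence, so each chain can become an independent task with no
    explicit synchronization.
    """
    return [list(range(d + r, n, d)) for r in range(d)]

def run_chained(a, b, d):
    """Execute the loop chain by chain; each chain could be its own task."""
    for chain in chain_partition(len(a), d):
        for i in chain:
            a[i] = a[i - d] + b[i]
    return a
```

Because the chains are independent, executing them in any order (or concurrently) produces the same result as the original sequential loop.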