This paper describes an AL compiler for the Warp systolic array. AL is a programming language in which the user programs a systolic array as if it were a sequential computer and relies on the compiler to generate parallel code. This paper introduces the notion of data relations in compiling programs for systolic arrays. Unlike dependence relations among statements of a program, data relations define compatibility relations among data objects of a program. The AL compiler uses data relations to compute data compatibility classes, determine data distribution, and distribute loop iterations. The AL compiler can generate efficient parallel code almost identical to what the user would have written by hand. For example, the AL compiler generates parallel code for the LINPACK LU decomposition (SGEFA) and QR decomposition (SQRDC) routines with a nearly 8-fold speedup on the 10-cell Warp array for matrices of size 180 × 180.
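To put the reported speedup in perspective, the short sketch below computes the parallel efficiency it implies, that is, speedup divided by the number of cells. The function name `parallel_efficiency` is hypothetical and introduced only for illustration. The value 8.0 stands in for the abstract's "nearly 8-fold", so the result is an approximate upper bound rather than a measured figure.

```python
def parallel_efficiency(speedup: float, cells: int) -> float:
    """Return speedup divided by the number of processing cells."""
    if cells <= 0:
        raise ValueError("cells must be positive")
    return speedup / cells


# Figures taken from the abstract: a nearly 8-fold speedup on the 10-cell Warp array.
# 8.0 is an approximation of "nearly 8-fold" (assumption for this illustration).
efficiency = parallel_efficiency(8.0, 10)
print(f"{efficiency:.0%}")  # prints "80%"
```

Under that approximation, each of the 10 cells is doing useful work roughly 80% of the time on the 180 × 180 SGEFA and SQRDC runs.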