Microprocessors of the next decade and beyond will be built using VLSI chips employing billions of transistors. In this generation of microprocessors, achieving a high level of parallelism at a reasonable clock speed will require full distribution of machine resources. Raw architectures explore this architectural space by distributing all their processor and memory resources as a 2-D mesh of simple tiles. To provide a simple sequential programming model, a Raw architecture exposes the hardware fully and relies on the compiler or the software run-time system to achieve efficient execution while maintaining the semantics of a single instruction stream and a unified memory system. Supporting a single view of memory on top of a distributed memory architecture presents a challenging compiler problem. This paper presents a compiler system called Maps that supports a unified view of memory on a Raw architecture. Maps relies on two inter-tile interconnects: a fast, statically schedulable network and a slower dynamic network. Maps attempts to schedule the memory accesses for maximum parallelism and speed while enforcing proper dependence. It optimizes for speed in two ways: by finding accesses that can be scheduled on the static interconnect through a process called static promotion, and by minimizing dependence sequentialization for the remaining accesses. Static accesses are discovered through applications of traditional pointer and array analysis, and a new technique called modulo unrolling. Maps enforces proper dependence through a combination of explicit synchronization and a technique called software serial ordering. We have implemented Maps based on the SUIF infrastructure. This paper presents preliminary results based on compiling several programs using Maps.
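The abstract names modulo unrolling without defining it. The sketch below illustrates one plausible reading and is not the Maps implementation. It assumes that array elements are interleaved across tiles so that element i lives on tile i mod N, and that the loop is unrolled by a factor of N. Under those assumptions each access site in the unrolled body always refers to the same tile, which is the property a statically scheduled network would need. The tile count and all function names are hypothetical.

```python
from collections import defaultdict

N_TILES = 4  # hypothetical tile count, chosen only for illustration


def home_tile(index, n_tiles=N_TILES):
    """Assumed layout: element i of an array lives on tile i mod n_tiles."""
    return index % n_tiles


def tiles_per_site_rolled(n, n_tiles=N_TILES):
    """Original loop: the single access site A[i] visits many tiles."""
    sites = defaultdict(set)
    for i in range(n):
        sites["A[i]"].add(home_tile(i, n_tiles))
    return dict(sites)


def tiles_per_site_unrolled(n, n_tiles=N_TILES):
    """Loop unrolled by n_tiles: one access site A[i+k] per body copy.

    Leftover iterations (n not a multiple of n_tiles) are skipped here;
    a real compiler would handle them in a separate cleanup loop.
    """
    sites = defaultdict(set)
    for i in range(0, n - n % n_tiles, n_tiles):
        for k in range(n_tiles):  # stands in for n_tiles copies of the body
            sites[f"A[i+{k}]"].add(home_tile(i + k, n_tiles))
    return dict(sites)


def is_static(sites):
    """True if every access site refers to exactly one tile."""
    return all(len(tiles) == 1 for tiles in sites.values())
```

In this toy model the rolled loop has one access site that touches every tile, so its target tile is not known at compile time. After unrolling, site A[i+k] touches only tile k, so each access could in principle be assigned to a fixed tile ahead of time.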