Reducing control overhead in dataflow architectures

  • Authors:
  • Andrew Petersen;Andrew Putnam;Martha Mercaldi;Andrew Schwerin;Susan Eggers;Steve Swanson;Mark Oskin

  • Affiliations:
  • University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA

  • Venue:
  • Proceedings of the 15th international conference on Parallel architectures and compilation techniques
  • Year:
  • 2006

Abstract

In recent years, computer architects have proposed tiled architectures in response to several emerging problems in processor design, such as design complexity, wire delay, and fabrication reliability. One of these architectures, WaveScalar, uses a dynamic, tagged-token dataflow execution model to simplify the design of the processor tiles and their interconnection network and to achieve good parallel performance. However, using a dataflow execution model reawakens old problems, including the instruction overhead required for control flow. Previous work compiling the functional language Id to the Monsoon Dataflow System found this overhead to be 2–3× that of programs written in C and targeted to a MIPS R3000.

In this paper, we present and analyze three compiler optimizations that significantly reduce control overhead with minimal additional hardware. We begin by describing how to translate imperative code into dataflow assembly and analyze the resulting control overhead. We report a similar 2–4× instruction overhead, which suggests that the execution model, rather than a specific source language or target architecture, is responsible. Then, we present the compiler optimizations, each of which is designed to eliminate a particular type of control overhead, and analyze the extent to which they do so. Finally, we evaluate the effect that all of the optimizations together have on program performance. Together, the optimizations reduce control overhead by 80% on average, increasing application performance by 21–37%.