Architectural support for the efficient generation of code for horizontal architectures

  • Authors:
  • B. R. Rau; C. D. Glaeser; E. M. Greenawalt

  • Affiliations:
  • Advanced Processor Technology Laboratory, ESL Inc., San Jose, California (all authors)

  • Venue:
  • ASPLOS I Proceedings of the first international symposium on Architectural support for programming languages and operating systems
  • Year:
  • 1982

Abstract

Horizontal architectures, such as the CDC Advanced Flexible Processor [1] and the FPS AP-120B [2], consist of a number of resources that can operate in parallel, each of which is controlled by a field in the wide instruction word. Such architectures have been developed to perform high-speed scientific computations at a modest cost. Figure 1 displays those characteristics of horizontal architectures that are germane to the issues discussed in this paper. The simultaneous requirements of high performance and low cost lead to an architecture consisting of multiple pipelined processing elements (PEs) such as adders and multipliers, a memory (which for scheduling purposes may be viewed as yet another PE with two operations: a READ and a WRITE), and an interconnect which ties them all together. The interconnect allows the result of one operation to be directly routed to another PE as one of the inputs for an operation that is to be performed there. The required memory bandwidth is reduced since temporary values need not be written to and read from the memory. The final aspect of horizontal processors that is of interest is that their program memories emit wide instructions which synchronously specify the actions of the multiple and possibly dissimilar PEs. The program memory is sequenced by a conventional sequencer that assumes sequential flow of control unless a branch is explicitly specified. As a consequence of the simplicity of such an architecture, it is inexpensive relative to the potential performance of the multiple pipelined PEs. However, if this potential performance is to be realized, the multiple resources of a horizontal processor must be scheduled effectively. The scheduling task for conventional horizontal processors is quite complex and the construction of highly optimizing compilers for them is a difficult and expensive project. The polycyclic architecture [3-6] is a horizontal architecture with architectural support for the scheduling task.
The cause of the complexity involved in scheduling conventional horizontal processors and the manner in which the polycyclic architecture addresses this issue are outlined in this paper.
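
To make the scheduling problem concrete, the following is a minimal sketch of packing operations into wide instruction words, one field per PE. It is not the method of the paper: the `Op` type, the `LATENCY` table, the PE names, and the greedy placement rule are all assumptions chosen for illustration. The sketch also ignores interconnect contention entirely, so it understates the difficulty of the real task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str
    pe: str            # PE (instruction-word field) that executes this op
    deps: tuple = ()   # names of ops whose results this op consumes


# Hypothetical pipeline latencies in cycles; not taken from any real machine.
LATENCY = {"mem": 2, "mul": 3, "add": 2}


def schedule(ops):
    """Greedily place each op in the earliest wide instruction word where
    its inputs are available and its PE field is still unused.

    `ops` is assumed to be listed in dependency (topological) order.
    Returns (start, words): start maps op name -> issue cycle, and
    words[c] maps PE name -> op name issued in cycle c.
    """
    by_name = {op.name: op for op in ops}
    start = {}
    words = []
    for op in ops:
        # Inputs are ready once every producer has left its pipeline.
        cycle = max(
            (start[d] + LATENCY[by_name[d].pe] for d in op.deps),
            default=0,
        )
        # Each PE accepts at most one new operation per cycle.
        while cycle < len(words) and op.pe in words[cycle]:
            cycle += 1
        while len(words) <= cycle:
            words.append({})
        words[cycle][op.pe] = op.name
        start[op.name] = cycle
    return start, words


# Example: a*b + c, with the memory treated as one more PE.
OPS = [
    Op("ld_a", "mem"),
    Op("ld_b", "mem"),
    Op("ld_c", "mem"),
    Op("mul_ab", "mul", ("ld_a", "ld_b")),
    Op("add_c", "add", ("mul_ab", "ld_c")),
]

if __name__ == "__main__":
    start, words = schedule(OPS)
    for cycle, word in enumerate(words):
        print(cycle, word)
```

Under these assumed latencies the three loads issue in cycles 0 to 2, the multiply in cycle 3, and the add in cycle 6, leaving the words for cycles 4 and 5 empty. Idle slots of this kind are what an optimizing scheduler tries to fill with other work, for example by overlapping operations from different loop iterations.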