Efficient control generation for mapping nested loop programs onto processor arrays

  • Authors:
  • Hritam Dutta;Frank Hannig;Holger Ruckdeschel;Jürgen Teich

  • Affiliations:
  • Department of Computer Science 12, Hardware-Software-Co-Design, University of Erlangen-Nuremberg, Am Weichselgarten 3, 91058 Erlangen, Bayern, Germany (all four authors)

  • Venue:
  • Journal of Systems Architecture: the EUROMICRO Journal
  • Year:
  • 2007

Abstract

Processor array architectures are optimal platforms for computationally intensive applications. Such architectures are characterized by hierarchies of parallelism and memory structures, i.e., apart from different levels of cache, processor arrays have a large number of processing elements (PEs), where each PE can further contain sub-word parallelism. In order to handle large-scale problems, balance local memory requirements with I/O bandwidth, and use different hierarchies of parallelism and memory, one needs a sophisticated transformation called hierarchical partitioning. Innately, the applications are data-flow dominant and have almost no control flow, but applying hierarchical partitioning techniques has the disadvantage of a more complex control flow. In a previous paper, the authors presented for the first time a methodology for automated control path synthesis for the mapping of partitioned algorithms onto processor arrays. However, the control path contained complex multiplication and division operators. In this paper, we propose a significant extension to the methodology which reduces the hardware cost of the global controller and memory address generators by avoiding these costly operations.
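
The general idea of removing multipliers from an address generator can be illustrated with a small, generic sketch. The Python below is not the authors' methodology; it is a minimal illustration, assuming a row-major array that is traversed tile by tile and tile sizes that divide the array dimensions evenly. All function names, parameter names, and the traversal order are hypothetical. The first function computes each address with multiplications, the second produces the same sequence using only counters, comparisons, and additions of constants fixed before the loop starts.

```python
def addresses_multiply(rows, cols, tr, tc):
    """Reference version: one multiply-based address per element."""
    out = []
    for ti in range(rows // tr):          # tile row index
        for tj in range(cols // tc):      # tile column index
            for r in range(tr):           # row inside the tile
                for c in range(tc):       # column inside the tile
                    out.append((ti * tr + r) * cols + (tj * tc + c))
    return out


def addresses_incremental(rows, cols, tr, tc):
    """Same sequence with counters and constant address increments.

    Assumes rows % tr == 0 and cols % tc == 0 (illustrative restriction).
    """
    # Constants that would be fixed at design time, not computed per step.
    total = rows * cols
    tiles_per_row = cols // tc
    step_col = 1                          # next column in the same tile row
    step_row = cols - tc + 1              # first column of the next tile row
    step_tile = 1 - (tr - 1) * cols       # top-left of the next tile to the right
    step_tile_row = 1                     # top-left of the first tile one tile row down

    out = []
    addr = 0
    c = r = tj = 0
    for _ in range(total):
        out.append(addr)
        c += 1
        if c < tc:
            addr += step_col
            continue
        c = 0
        r += 1
        if r < tr:
            addr += step_row
            continue
        r = 0
        tj += 1
        if tj < tiles_per_row:
            addr += step_tile
            continue
        tj = 0
        addr += step_tile_row
    return out
```

In this sketch the per-step work is reduced to an increment, a comparison, and one addition of a precomputed constant, which is the kind of operation that is cheap to realize in a hardware controller compared with a general multiplier or divider.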