Mapping the FDTD Application to Many-Core Chip Architectures

  • Authors:
  • Daniel A. Orozco;Guang R. Gao

  • Affiliations:
  • -;-

  • Venue:
  • ICPP '09 Proceedings of the 2009 International Conference on Parallel Processing
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper reports a study of mapping the Finite Difference Time Domain (FDTD) application to the IBM Cyclops-64 (C64) many-core chip architecture [1]. C64 is chosen for this study as it represents the current trend in computer architecture to develop a class of many-core architectures with distinct features e.g. software manageable on-chip memory hierarchy (vs. a hardware-managed data cache), high on-chip bandwidth, fine grain multithreading and synchronization, among others. Major results of our study include: 1. A good mapping of FDTD can effectively exploit the on-chip parallelism of C64-like architectures and show good performance and scalability. 2. Such performance improvement is derived by employing a number of code optimization techniques such as time skewing and split tiling that judiciously exploit the architecture features described in (1). 3. High performance requires maximum reuse of on-chip memory, which is obtained by tiling with non conventional tile shapes. 4. Such code optimization techniques we used in (2) and tiling such as the one used in (3) should be implementable within a reasonable compilation framework, opening a new set of possibilities for compiler optimizations.