This paper presents a methodology that enables optimized execution of FDTD (Finite Difference Time Domain) computations on a streaming-model architecture (Cell BE processors). A data flow graph that represents the FDTD computations in an irregular wave propagation area is transformed into a set of tiles. Each tile represents regular computations for a small part of the given computational area. Tiles are injected into computational nodes and processed in a pipeline-like manner. We show that this approach solves the FDTD problem with a speedup almost equal to the ideal one. Several computation optimization methods are presented, and the efficiency of streaming computations is discussed for various simulation parameters. Experimental results obtained for the streaming model on a physical PS3 machine are presented as well.
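The idea of splitting an FDTD update into tiles can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a regular one-dimensional grid rather than an irregular area, a plain leapfrog update with a hypothetical coefficient `c`, and sequential Python loops in place of Cell BE computational nodes. All function names (`make_tiles`, `update_h_tile`, `update_e_tile`, `fdtd_step_tiled`, `run`) are invented for illustration. Each tile updates only its own cells and reads at most one neighboring cell from the adjacent tile.

```python
def fdtd_step_reference(e, h, c=0.5):
    """One leapfrog step of a 1D FDTD scheme over the whole grid (untiled)."""
    n = len(e)
    for i in range(n - 1):
        h[i] += c * (e[i + 1] - e[i])
    for i in range(1, n - 1):
        e[i] += c * (h[i] - h[i - 1])


def make_tiles(n, tile_size):
    """Split the index range [0, n) into half-open tiles (lo, hi)."""
    return [(s, min(s + tile_size, n)) for s in range(0, n, tile_size)]


def update_h_tile(e, h, lo, hi, c):
    """Update H inside one tile; reads e[hi] from the next tile if it exists."""
    n = len(e)
    for i in range(lo, min(hi, n - 1)):
        h[i] += c * (e[i + 1] - e[i])


def update_e_tile(e, h, lo, hi, c):
    """Update E inside one tile; reads h[lo - 1] from the previous tile."""
    n = len(e)
    for i in range(max(lo, 1), min(hi, n - 1)):
        e[i] += c * (h[i] - h[i - 1])


def fdtd_step_tiled(e, h, tile_size, c=0.5):
    """One time step expressed as two passes over the tiles.

    Within a pass the tiles are independent of each other, so in a real
    system they could be handed to separate compute nodes. Here they are
    simply processed one after another.
    """
    tiles = make_tiles(len(e), tile_size)
    for lo, hi in tiles:
        update_h_tile(e, h, lo, hi, c)
    for lo, hi in tiles:
        update_e_tile(e, h, lo, hi, c)


def run(n=64, steps=20, tile_size=8):
    """Run the untiled and tiled versions from the same initial pulse."""
    e_ref = [0.0] * n
    h_ref = [0.0] * n
    e_ref[n // 2] = 1.0  # single-cell initial excitation (assumption)
    e_tiled = list(e_ref)
    h_tiled = list(h_ref)
    for _ in range(steps):
        fdtd_step_reference(e_ref, h_ref)
        fdtd_step_tiled(e_tiled, h_tiled, tile_size)
    return e_ref, e_tiled, h_ref, h_tiled


if __name__ == "__main__":
    e_ref, e_tiled, h_ref, h_tiled = run()
    print("tiled result matches untiled:", e_ref == e_tiled and h_ref == h_tiled)
```

Because every cell receives the same arithmetic in both versions, the tiled fields should match the untiled ones exactly. The sketch says nothing about speedup; it only shows that the update can be decomposed into small regular units of work.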