Finite Difference (FD) is a widely used method for solving Partial Differential Equations (PDEs). PDEs are at the core of many simulations in different scientific fields, e.g. geophysics and astrophysics. The typical FD solver performs stencil computations over the entire 3D domain, thus solving the differential operator. This computation consists of accumulating the contributions of the neighbor points along the Cartesian axes. Its performance is bound by two main problems: the memory access pattern and the inefficient reuse of data. We propose a novel algorithm, named "semi-stencil", that tackles these two problems. Our first target architecture for testing is Cell/B.E., where the implementation reaches 12.4 GFlops (49% of peak performance) per SPE, while the classical stencil computation only reaches 34%. Further, we successfully apply this code optimization to an industrial-strength application (Reverse-Time Migration). These results show that semi-stencil is a useful stencil computation optimization.
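To make the classical stencil computation concrete, the following is a minimal Python sketch of a symmetric, star-shaped 3D stencil. It is an illustration only, not the semi-stencil algorithm and not the Cell/B.E. implementation. The function name `classical_stencil_3d`, the nested-list layout `u[z][y][x]`, and the coefficient convention (`coeffs[0]` for the center, `coeffs[r]` for neighbors at distance `r`) are assumptions made for this example.

```python
def classical_stencil_3d(u, coeffs):
    """Apply a symmetric star-shaped stencil to the interior of a 3D grid.

    Hypothetical helper for illustration. `u` is a nested list indexed as
    u[z][y][x]. coeffs[0] weights the center point and coeffs[r] weights the
    six neighbors at distance r along the Cartesian axes. Points closer than
    the stencil radius to a boundary are left at 0.0.
    """
    radius = len(coeffs) - 1
    nz, ny, nx = len(u), len(u[0]), len(u[0][0])
    out = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    for z in range(radius, nz - radius):
        for y in range(radius, ny - radius):
            for x in range(radius, nx - radius):
                # Start from the center contribution.
                acc = coeffs[0] * u[z][y][x]
                # Accumulate the neighbor contributions along each axis.
                for r in range(1, radius + 1):
                    acc += coeffs[r] * (
                        u[z][y][x - r] + u[z][y][x + r]
                        + u[z][y - r][x] + u[z][y + r][x]
                        + u[z - r][y][x] + u[z + r][y][x]
                    )
                out[z][y][x] = acc
    return out


if __name__ == "__main__":
    # 7-point Laplacian (radius 1, unit grid spacing) applied to x^2 + y^2 + z^2.
    n = 5
    grid = [[[float(x * x + y * y + z * z) for x in range(n)]
             for y in range(n)] for z in range(n)]
    result = classical_stencil_3d(grid, [-6.0, 1.0])
    print(result[2][2][2])
```

In this sketch every output point reads 6 × radius + 1 input points. If the grid were stored as a flat array with x as the fastest-varying index, only the x-axis neighbors would be adjacent in memory, while the y and z neighbors would be separated by a row or a plane. This illustrates the memory access pattern and data reuse problems named in the abstract.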