Eliminating the memory bottleneck: an FPGA-based solution for 3D reverse time migration

Authors:
Haohuan Fu; Robert G. Clapp
Affiliations:
Tsinghua University, Beijing, China; Stanford University, Stanford, CA, USA
Venue:
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
Year:
2011

Abstract

Memory-related constraints, such as memory bandwidth and cache size, are now the performance bottleneck of most computational applications. On multicore systems in particular, performance often fails to scale with the number of cores. We present an FPGA-based solution for the 3D Reverse Time Migration (RTM) algorithm. RTM is the most computationally demanding imaging algorithm in current oil and gas exploration, and it poses several computational challenges, including high demands on storage capacity and bandwidth and poor cache behavior. By combining algorithmic and architectural optimizations, our FPGA-based solution removes the memory constraints and delivers performance that scales well with the available computational resources. Compared with an optimized CPU implementation running on two quad-core Intel Nehalem CPUs, our solution achieves a 4x speedup on two Virtex-5 FPGAs and an 8x speedup on two Virtex-6 FPGAs. Our projection indicates that performance will continue to scale as FPGA capacities increase.
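
The abstract attributes RTM's cost to storage, bandwidth, and cache behavior without describing the kernel itself. As a rough illustration of why wave-propagation kernels of this general kind tend to be memory-bound, the sketch below advances a 3D wavefield by one time step with a simple finite-difference stencil. Everything in it is an assumption made for illustration: the second-order 7-point scheme, the constant velocity term `coeff`, the tiny grid, and the names `make_grid`, `step`, and `demo` are hypothetical and are not taken from the paper, whose actual scheme and FPGA design are not given here.

```python
# Illustrative sketch only: a second-order finite-difference update for the
# constant-velocity acoustic wave equation on a small 3D grid.
# The scheme, grid size, and all names are assumptions, not the paper's design.


def make_grid(n, value=0.0):
    """Return an n x n x n grid stored as nested lists."""
    return [[[value for _ in range(n)] for _ in range(n)] for _ in range(n)]


def step(prev, curr, coeff):
    """Advance the wavefield one time step.

    prev, curr: wavefields at the two most recent time steps.
    coeff: (velocity * dt / h) ** 2, assumed constant over the grid.
    Boundary points are left at zero (no absorbing layer in this sketch).
    """
    n = len(curr)
    nxt = make_grid(n)
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                # 7-point Laplacian: six neighbours minus six times the centre.
                lap = (curr[i - 1][j][k] + curr[i + 1][j][k]
                       + curr[i][j - 1][k] + curr[i][j + 1][k]
                       + curr[i][j][k - 1] + curr[i][j][k + 1]
                       - 6.0 * curr[i][j][k])
                # Leapfrog update in time: needs both earlier wavefields.
                nxt[i][j][k] = (2.0 * curr[i][j][k] - prev[i][j][k]
                                + coeff * lap)
    return nxt


def demo(n=9, steps=3, coeff=0.1):
    """Propagate a point impulse placed at the grid centre for a few steps."""
    prev = make_grid(n)
    curr = make_grid(n)
    mid = n // 2
    curr[mid][mid][mid] = 1.0  # stand-in for a seismic source
    for _ in range(steps):
        prev, curr = curr, step(prev, curr, coeff)
    return curr


if __name__ == "__main__":
    field = demo()
    print(field[4][4][4])
```

In this sketch each updated point reads eight grid values (seven from `curr`, one from `prev`) and writes one, while performing only about a dozen floating-point operations. The neighbours along the outermost index sit a full plane apart in memory, which is the kind of access pattern that strains caches on large 3D volumes.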