Gridding is a method of interpolating irregularly sampled data onto a uniform grid and is a critical image reconstruction step in several applications that operate on non-Cartesian sampled data. In this paper, we present an algorithm-architecture co-design framework for accelerating gridding using FPGAs. We present a parameterized hardware library for accelerating gridding that supports both arbitrary and regular trajectories. We further describe our kernel automation framework, which supports several kernel functions through look-up-table (LUT) based Taylor polynomial evaluation. This framework is integrated using an in-house multi-FPGA development platform that provides hardware infrastructure for integrating custom accelerators. Design-space exploration is enabled by an automation flow that allows system generation from an algorithm specification. We further provide several case studies by realizing systems for the nonuniform fast Fourier transform (NuFFT) with different parameter sets and porting them onto the BEE3 platform. Results show speedups of more than 16X and 2X over existing CPU and FPGA implementations, respectively, and up to 5.5 times higher performance-per-watt than a comparable GPU implementation.
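
To make the two central ideas concrete, the following is a minimal software sketch of one-dimensional gridding in which the interpolation kernel is evaluated through a look-up table of Taylor coefficients. It is an illustration only, not the hardware library described above: the Gaussian kernel, the first-order polynomial, the table size, and all function names (`build_taylor_lut`, `eval_taylor_lut`, `grid_1d`) are assumptions chosen for brevity, since the abstract does not specify the kernels, polynomial order, or table organization actually used.

```python
import math

SIGMA = 0.8        # assumed kernel width, in grid units
HALF_WIDTH = 2.0   # assumed kernel support: samples affect cells within this distance


def gaussian(d):
    """Example kernel; the abstract does not name the kernels it supports."""
    return math.exp(-d * d / (2.0 * SIGMA * SIGMA))


def dgaussian(d):
    """First derivative of the example kernel."""
    return -d / (SIGMA * SIGMA) * gaussian(d)


def build_taylor_lut(f, df, x_max, n_segments):
    """Split [0, x_max] into equal segments and store, for each segment,
    its midpoint c together with f(c) and f'(c)."""
    step = x_max / n_segments
    table = []
    for i in range(n_segments):
        c = (i + 0.5) * step
        table.append((c, f(c), df(c)))
    return step, table


def eval_taylor_lut(lut, x):
    """Approximate f(x) with the first-order Taylor polynomial of the
    segment containing x: f(c) + f'(c) * (x - c)."""
    step, table = lut
    i = min(int(x / step), len(table) - 1)  # clamp x == x_max into the last segment
    c, f_c, df_c = table[i]
    return f_c + df_c * (x - c)


def grid_1d(coords, values, grid_size, half_width, lut):
    """Accumulate each nonuniform sample onto every uniform grid cell
    that lies within half_width of the sample position."""
    grid = [0.0] * grid_size
    for x, v in zip(coords, values):
        lo = max(0, math.ceil(x - half_width))
        hi = min(grid_size - 1, math.floor(x + half_width))
        for g in range(lo, hi + 1):
            grid[g] += v * eval_taylor_lut(lut, abs(g - x))
    return grid


lut = build_taylor_lut(gaussian, dgaussian, HALF_WIDTH, 64)
samples_x = [1.3, 3.0, 4.75]   # irregular sample positions (illustrative data)
samples_v = [0.5, 1.0, -0.25]  # sample values (illustrative data)
gridded = grid_1d(samples_x, samples_v, 8, HALF_WIDTH, lut)
```

In this sketch the table replaces a transcendental function call with one table read, one multiply, and one add per kernel evaluation, which is the kind of trade-off that makes a LUT-based scheme attractive in hardware. Accuracy is set by the segment count and the polynomial order; a higher order would need more stored coefficients per segment but fewer segments for the same error.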