Matched Filter Computation on FPGA, Cell and GPU

Authors:
Zachary K. Baker; Maya B. Gokhale; Justin L. Tripp
Affiliations:
Los Alamos National Laboratory, USA; Los Alamos National Laboratory, USA; Los Alamos National Laboratory, USA
Venue:
FCCM '07 Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Year:
2007

Citing 0
Cited 9

A Comparison Study on Implementing Optical Flow and Digital Communications on FPGAs and GPUs
ACM Transactions on Reconfigurable Technology and Systems (TRETS)

State-of-the-art in heterogeneous computing
Scientific Programming

A two-level real-time vision machine combining coarse- and fine-grained parallelism
Journal of Real-Time Image Processing

Accelerating the Explicitly Restarted Arnoldi Method with GPUs Using an Autotuned Matrix Vector Product
SIAM Journal on Scientific Computing

A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays

A low-overhead interconnect architecture for virtual reconfigurable fabrics
Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems

A performance and energy comparison of convolution on GPUs, FPGAs, and multicore processors
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers

A high-performance, low-energy FPGA accelerator for correntropy-based feature tracking (abstract only)
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays

A framework for comparing high performance computing technologies
International Journal of Computational Science and Engineering

Abstract

The matched filter is an important kernel in the processing of hyperspectral data. The filter enables researchers to sift useful data from instruments that span large frequency bands and can produce gigabytes of data in seconds. In this work, we evaluate the performance of a matched filter implementation on three platforms: an FPGA-accelerated co-processor (Cray XD-1), the IBM Cell microprocessor, and the NVIDIA GeForce 7900 GTX GPU. We discuss in detail the challenges and opportunities afforded by each platform. In particular, we explore how to partition the filter most efficiently between the host CPU and the co-processor. Using our results, we derive several performance metrics that indicate the optimal solution for a variety of application scenarios.
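
For readers unfamiliar with the kernel, the following is a minimal sketch of one common textbook form of the hyperspectral matched filter. The abstract does not give the paper's exact formulation, so the formula, the function names (`mean_spectrum`, `covariance`, `solve`, `matched_filter_scores`), and the tiny two-band scene are all assumptions made for illustration. Each pixel spectrum x is scored as (t - mu)^T C^-1 (x - mu) / (t - mu)^T C^-1 (t - mu), where t is the target spectrum, mu the scene mean, and C the scene covariance.

```python
# Illustrative sketch only: a textbook matched filter, NOT the paper's code.
# Pure Python (no third-party packages); all names are hypothetical.

def mean_spectrum(pixels):
    """Per-band mean over all pixels."""
    n = len(pixels)
    return [sum(p[b] for p in pixels) / n for b in range(len(pixels[0]))]

def covariance(pixels, mu):
    """Band-by-band sample covariance, accumulated over all pixels."""
    bands = len(mu)
    cov = [[0.0] * bands for _ in range(bands)]
    for p in pixels:
        d = [p[b] - mu[b] for b in range(bands)]
        for i in range(bands):
            for j in range(bands):
                cov[i][j] += d[i] * d[j]
    return [[c / (len(pixels) - 1) for c in row] for row in cov]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def matched_filter_scores(pixels, target):
    """Score every pixel against the target spectrum."""
    mu = mean_spectrum(pixels)
    cov = covariance(pixels, mu)
    t = [target[b] - mu[b] for b in range(len(mu))]
    w = solve(cov, t)                      # w = C^-1 (t - mu)
    norm = sum(wi * ti for wi, ti in zip(w, t))
    return [sum(wi * (p[b] - mu[b]) for b, wi in enumerate(w)) / norm
            for p in pixels]

# Hypothetical two-band scene; the last pixel is used as the target.
scene = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.5], [2.5, 6.0]]
scores = matched_filter_scores(scene, target=scene[-1])
```

In this formulation the covariance accumulation is a reduction over all pixels and the final scoring is independent per pixel, while the linear solve is small and sequential. That split is one plausible reason a host/co-processor partitioning question arises, though the partitioning actually chosen in the paper is not stated in the abstract.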