A Comparison Study on Implementing Optical Flow and Digital Communications on FPGAs and GPUs
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
Pipelined architecture for real-time cost-optimized extraction of visual primitives based on FPGAs
Digital Signal Processing
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
Parallel architecture for hierarchical optical flow estimation based on FPGA
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
UWB microwave imaging for breast cancer detection: Many-core, GPU, or FPGA?
ACM Transactions on Embedded Computing Systems (TECS) - Special Issue on Design Challenges for Many-Core Processors, Special Section on ESTIMedia'13 and Regular Papers
Journal of Real-Time Image Processing
Hi-index | 0.00 |
FPGA devices have often found use as higher-performance alternatives to programmable processors for implementing a variety of computations. Applications successfully implemented on FPGAs have typically contained high levels of parallelism and have often used simple statically-scheduled control and modest arithmetic. Recently introduced computing devices such as coarse grain reconfigurable arrays, multi-core processors, and graphical processing units (GPUs) promise to significantly change the computational landscape for the implementation of high-speed real-time computing tasks. One reason for this is that these architectures take advantage of many of the same application characteristics that fit well on FPGAs. One real-time computing task, optical flow, is difficult to apply in robotic vision applications in practice because of its high computational and data rate requirements, and so is a good candidate for implementation on FPGAs and other custom computing architectures. In this paper, a tensor-based optical flow algorithm is implemented on both an FPGA and a GPU and the two implementations discussed. The two implementations had similar performance, but with the FPGA implementation requiring 12x more development time. Other comparison data for these two technologies is then given for three additional applications taken from a MIMO digital communication system design, providing additional examples of the relative capabilities of these two technologies.