We present a domain-specific approach to generating high-performance, hardware-software partitioned implementations of the discrete Fourier transform (DFT) in fixed-point precision. The partitioning strategy is a heuristic based on the DFT's divide-and-conquer algorithmic structure and is fine-tuned by feedback-driven exploration of candidate designs. We have integrated this approach into the Spiral linear-transform code-generation framework to support push-button automatic implementation. We evaluate hardware-software DFT implementations running on the embedded PowerPC processor and the reconfigurable fabric of the Xilinx Virtex-II Pro FPGA. In our experiments, the FPGA-accelerated libraries for the 1D and 2D DFT achieve between 2 and 7.5 times higher performance (operations per second) and up to 2.5 times better energy efficiency (operations per Joule) than the software-only version.
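To illustrate the divide-and-conquer structure that a partitioning heuristic of this kind can exploit, the following is a minimal Python sketch, not the Spiral-generated code or the actual heuristic. It assumes a radix-2 decimation-in-time recursion and floating-point rather than fixed-point arithmetic. The names `dft_partitioned`, `naive_dft`, and the cutoff `hw_size` are hypothetical: sub-transforms at or below `hw_size` stand in for a kernel that would be mapped to the FPGA fabric, while the recursive combination steps stand in for the software part.

```python
import cmath


def naive_dft(x):
    """Direct O(n^2) DFT; stands in for a fixed-size hardware kernel (assumption)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def dft_partitioned(x, hw_size=4, trace=None):
    """Radix-2 decimation-in-time DFT with a hypothetical hardware cutoff.

    Sub-DFTs of size <= hw_size are delegated to naive_dft, which plays the
    role of the accelerated kernel; larger sizes are split in software.
    If a list is passed as trace, the size of every delegated sub-DFT is
    appended to it.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n <= hw_size:
        if trace is not None:
            trace.append(n)
        return naive_dft(x)

    # Divide: two half-size DFTs over the even- and odd-indexed samples.
    even = dft_partitioned(x[0::2], hw_size, trace)
    odd = dft_partitioned(x[1::2], hw_size, trace)

    # Conquer: combine the halves with twiddle factors (butterflies).
    half = n // 2
    out = [0j] * n
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


if __name__ == "__main__":
    calls = []
    dft_partitioned([complex(i, 0) for i in range(16)], hw_size=4, trace=calls)
    print("sizes delegated to the kernel:", calls)
```

Varying `hw_size` changes how much of the transform lands in the kernel versus the recursive software steps, which is the kind of design choice that a feedback-driven search over candidate designs would explore.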