A New Approach to Pipeline FFT Processor
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Pipeline and Parallel-Pipeline FFT Processors for VLSI Implementations
IEEE Transactions on Computers
Implementations and Optimizations of Pipeline FFTs on Xilinx FPGAs
RECONFIG '08 Proceedings of the 2008 International Conference on Reconfigurable Computing and FPGAs
Efficient resource sharing architecture for multistandard communication system
VLSI Design - Special issue on CAD for Gigascale SoC Design and Verification Solutions
ICA3PP'12 Proceedings of the 12th international conference on Algorithms and Architectures for Parallel Processing - Volume Part I
On a wideband fast fourier transform for a radio telescope
ACM SIGARCH Computer Architecture News - ACM SIGARCH Computer Architecture News/HEART '12
Hi-index | 0.00 |
This paper presents optimized implementations of two different pipeline FFT processors on Xilinx Spartan-3 and Virtex-4 FPGAs. Different optimization techniques and rounding schemes were explored. The implementation results achieved better performance with lower resource usage than prior art. The 16-bit 1024-point FFT with the R22SDF architecture had a maximum clock frequency of 95.2 MHz and used 2802 slices on the Spartan-3, a throughput per area ratio of 0.034 Msamples/s/slice. The R4SDC architecture ran at 123.8 MHz and used 4409 slices on the Spartan-3, a throughput per area ratio of 0.028 Msamples/s/slice. On Virtex-4, the 16-bit 1024-point R22SDF architecture ran at 235.6 MHz and used 2256 slice, giving a 0.104 Msamples/s/slice ratio; the 16-bit 1024-point R4SDC architecture ran at 219.2 MHz and used 3064 slices, giving a 0.072 Msamples/s/slice ratio. The R22SDF was more efficient than the R4SDC in terms of throughput per area due to a simpler controller and an easier balanced rounding scheme. This paper also shows that balanced stage rounding is an appropriate rounding scheme for pipeline FFT processors.