This paper presents an implementation of a very fast parallel complex FFT on M2, the second generation of the MorphoSys reconfigurable computation platform, which targets streamed applications such as multimedia and DSP. The proposed mapping comprises fast presorting, cascaded radix-2 stages, and post-reordering. Data and twiddle factors each have a 16-bit real part and a 16-bit imaginary part in two's complement format, and scaling is performed to avoid overflow. The mapping is tested on our cycle-accurate simulator, "Mulate", and its performance is better than that of other architectures such as Imagine and VIRAM. Moreover, the performance scales with FFT size. Since M2 has no functionality specifically tailored to the FFT, the results demonstrate the capability of the MorphoSys architecture to extract parallelism from streamed applications. Further rationale is given in terms of scalar operand networks and the memory hierarchy.
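The data path described above (presorting, cascaded radix-2 stages, 16-bit fixed-point complex values with scaling) can be illustrated with a minimal sequential sketch. It is not the M2 mapping itself. Several choices are assumptions made only for illustration: the Q15 interpretation of the 16-bit values, bit-reversal as the presorting step, halving after every butterfly as the scaling scheme, and all function names. The abstract does not specify any of these, and the post-reordering step is not modeled.

```python
import math

Q15_ONE = 32767  # largest positive 16-bit value, used here to quantize twiddles (assumed Q15)


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i (assumed presorting permutation)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def twiddle_q15(k, span):
    """Quantized twiddle factor exp(-2*pi*j*k/span) as a (re, im) integer pair."""
    ang = -2.0 * math.pi * k / span
    return round(math.cos(ang) * Q15_ONE), round(math.sin(ang) * Q15_ONE)


def cmul_q15(a, b):
    """Complex multiply of two Q15 pairs; the product is shifted back to Q15."""
    re = (a[0] * b[0] - a[1] * b[1]) >> 15
    im = (a[0] * b[1] + a[1] * b[0]) >> 15
    return re, im


def fft_q15(x):
    """Radix-2 decimation-in-time FFT on (re, im) integer pairs.

    Every butterfly output is halved (assumed scaling scheme), so the
    result approximates DFT(x) / len(x). Python integers do not wrap,
    so hardware overflow or saturation is not modeled here.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1

    # Presorting: place each input at its bit-reversed index.
    data = [None] * n
    for i, v in enumerate(x):
        data[bit_reverse(i, bits)] = v

    # Cascaded radix-2 stages: butterfly span doubles each stage.
    half = 1
    while half < n:
        span = 2 * half
        for start in range(0, n, span):
            for k in range(half):
                a = data[start + k]
                t = cmul_q15(data[start + k + half], twiddle_q15(k, span))
                data[start + k] = ((a[0] + t[0]) >> 1, (a[1] + t[1]) >> 1)
                data[start + k + half] = ((a[0] - t[0]) >> 1, (a[1] - t[1]) >> 1)
        half = span
    return data


if __name__ == "__main__":
    impulse = [(16384, 0)] + [(0, 0)] * 7
    print(fft_q15(impulse)[:2])  # prints [(2048, 0), (2048, 0)]
```

An impulse of amplitude 16384 in an 8-point transform comes out as 2048 in every bin, that is, the flat spectrum divided by 8. The division is the cumulative effect of halving once per stage over three stages. Within a single stage the butterflies are independent of one another, which is the parallelism an array architecture can exploit; this sketch simply runs them in sequence.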