Efficient VLSI architectures for fast computation of the discrete Fourier transform and its inverse

Authors:
Ching-Hsien Chang;Chin-Liang Wang;Yu-Tai Chang
Affiliations:
Dept. of Electr. Eng., Nat. Tsing Hua Univ., Hsinchu;-;-
Venue:
IEEE Transactions on Signal Processing
Year:
2000

Abstract

In this paper, we propose two new VLSI architectures for computing the N-point discrete Fourier transform (DFT) and its inverse (IDFT) based on a radix-2 fast algorithm, where N is a power of two. The first part of this work presents a linear systolic array that requires log₂ N complex multipliers and provides a throughput of one transform sample per clock cycle. Compared with other related systolic designs based on direct computation or a radix-2 fast algorithm, the proposed design has the same throughput but lower hardware complexity. This design is suitable for high-speed real-time applications, but it cannot easily be realized on a single chip when N becomes large. To balance chip area against processing speed, we further present a reduced-complexity design for the DFT/IDFT computation. This alternative is a memory-based architecture that consists of one complex multiplier, two complex adders, and some special memory units. It computes one transform sample every log₂ N + 1 clock cycles on average. In comparison with the first design, the second design trades lower throughput for less hardware complexity. For N = 512, the chip area required for the memory-based design is about 5742 × 5222 μm², and the corresponding throughput can reach 4M transform samples per second in 0.6 μm CMOS technology. Such area-time performance makes this design very competitive for long-length DFT applications, such as asymmetric digital subscriber lines (ADSL) and orthogonal frequency-division multiplexing (OFDM) systems.
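
As a software illustration of the radix-2 structure the abstract refers to, the following is a minimal sketch of a generic textbook radix-2 decimation-in-time FFT. It is not the authors' systolic or memory-based architecture, and the names `fft_radix2`, `dft_direct`, and `cycles_per_sample` are hypothetical. The sketch shows that an N-point transform decomposes into log₂ N butterfly stages, and it includes a helper that evaluates the log₂ N + 1 cycles-per-sample figure quoted for the memory-based design.

```python
import cmath


def fft_radix2(x, inverse=False):
    """Generic iterative radix-2 decimation-in-time FFT (textbook form).

    This is a software sketch only; it does not model the paper's hardware.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    stages = n.bit_length() - 1  # log2(N) butterfly stages

    # Bit-reversal permutation of the input samples.
    a = [0j] * n
    for i in range(n):
        rev = int(format(i, f"0{stages}b")[::-1], 2) if stages else 0
        a[rev] = complex(x[i])

    sign = 1 if inverse else -1  # the IDFT uses the conjugate twiddle factors
    size = 2
    while size <= n:  # one pass per stage, log2(N) passes in total
        half = size // 2
        w_step = cmath.exp(sign * 2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                u = a[start + k]
                t = w * a[start + k + half]  # one complex multiply per butterfly
                a[start + k] = u + t
                a[start + k + half] = u - t
                w *= w_step
        size *= 2

    if inverse:
        a = [v / n for v in a]  # 1/N scaling for the inverse transform
    return a


def dft_direct(x):
    """Direct O(N^2) DFT, used here only as a reference for checking."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def cycles_per_sample(n):
    """Average clock cycles per transform sample, log2(N) + 1, as quoted
    in the abstract for the memory-based design."""
    return (n.bit_length() - 1) + 1
```

For N = 512, `cycles_per_sample(512)` evaluates to 10. If the quoted rate of 4M transform samples per second corresponds to this figure, the implied clock rate would be roughly 40 MHz; the abstract does not state the clock rate, so this is an inference rather than a reported value.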