Cost-effective triple-mode reconfigurable pipeline FFT/IFFT/2-D DCT processor

Authors:
Chin-Teng Lin; Yuan-Chu Yu; Lan-Da Van
Affiliations:
University Provost and Electrical and Control Engineering, National Chiao-Tung University, Hsinchu, Taiwan, R.O.C.; Department of Electrical and Control Engineering, National Chiao-Tung University, Hsinchu, Taiwan, R.O.C.; Department of Computer Science, National Chiao-Tung University, Hsinchu, Taiwan, R.O.C.
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2008

Abstract

This investigation proposes a novel radix-4² algorithm that combines the low computational complexity of a radix-16 algorithm with the lower hardware requirement of a radix-4 algorithm. The proposed pipeline radix-4² single delay feedback path (R4²SDF) architecture adopts a multiplierless radix-4 butterfly structure, based on a specific linear mapping of the common factor algorithm (CFA), and supports both 256-point fast Fourier transform/inverse fast Fourier transform (FFT/IFFT) and 8 × 8 2-D discrete cosine transform (DCT) modes together with a highly efficient feedback shift register architecture. Segment shift register (SSR) and overturn shift register (OSR) structures are adopted to minimize the register cost of the input reordering and post-computation operations, respectively, in the 8 × 8 2-D DCT mode. Moreover, retrenched constant multiplier and eight-folded complex multiplier structures are adopted to decrease the multiplier cost and the coefficient ROM size by exploiting the complex conjugate symmetry rule and the subexpression elimination technique. To further decrease the chip cost, a finite wordlength analysis indicates that the proposed architecture requires only a 13-bit internal wordlength to achieve a 40-dB signal-to-noise ratio (SNR) in the 256-point FFT/IFFT modes and high digital video (DV) compression quality in the 8 × 8 2-D DCT mode. Comprehensive comparison results indicate that the proposed cost-effective reconfigurable design has the smallest hardware requirement and the largest hardware utilization among the tested architectures for FFT/IFFT computation, and thus the highest cost efficiency. The derivation and chip implementation results show that the proposed pipeline 256-point FFT/IFFT/2-D DCT triple-mode chip consumes 22.37 mW at 100 MHz with a 1.2-V supply voltage in a TSMC 0.13-µm CMOS process, which makes it well suited as RSoC IP for next-generation handheld devices.
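The decomposition idea in the abstract can be illustrated with a small software sketch. The code below is not the authors' R4²SDF datapath and models no shift registers, ROM folding, or fixed-point wordlength. It only shows, under the assumption of a standard common-factor index mapping (n = r·n1 + n2, k = k1 + r·k2), how a 256-point DFT can be built from radix-16 stages that are themselves made of two radix-4 butterfly stages. All function names (`mul_neg_j`, `bf4`, `two_stage`, `dft16`, `fft256`, `naive_dft`) are hypothetical and chosen for this example.

```python
import cmath


def mul_neg_j(z):
    # Multiplying by -j only swaps the real/imaginary parts and flips one sign,
    # so no real multiplier is needed.
    return complex(z.imag, -z.real)


def bf4(a, b, c, d):
    """4-point DFT butterfly built from additions, subtractions and swaps only."""
    t0, t1 = a + c, a - c
    t2, t3 = b + d, mul_neg_j(b - d)
    return [t0 + t2, t1 + t3, t0 - t2, t1 - t3]


def two_stage(x, r, small_dft):
    """DFT of length r*r via the common-factor mapping n = r*n1 + n2, k = k1 + r*k2.

    An r-point DFT runs over n1, the result is scaled by the twiddle
    W_N^(n2*k1), and a second r-point DFT runs over n2.
    """
    n = r * r
    w = [cmath.exp(-2j * cmath.pi * m / n) for m in range(n)]
    inner = [small_dft([x[r * n1 + n2] for n1 in range(r)]) for n2 in range(r)]
    out = [0j] * n
    for k1 in range(r):
        col = small_dft([inner[n2][k1] * w[n2 * k1] for n2 in range(r)])
        for k2 in range(r):
            out[k1 + r * k2] = col[k2]
    return out


def dft16(v):
    # "Radix-16" stage realised as two radix-4 stages with W16 factors between them.
    return two_stage(v, 4, lambda u: bf4(*u))


def fft256(v):
    # Two radix-16 stages; the W256 twiddles appear only between them.
    return two_stage(v, 16, dft16)


def naive_dft(x):
    # Direct O(N^2) reference, used only to check the sketch.
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


if __name__ == "__main__":
    data = [complex((7 * i) % 11 - 5, (3 * i) % 5 - 2) for i in range(256)]
    err = max(abs(a - b) for a, b in zip(naive_dft(data), fft256(data)))
    print("max deviation from direct DFT:", err)
```

In this sketch the general W256 twiddles are applied once, between the two radix-16 stages, while the inner stages use only the small fixed set of W16 constants and the trivial factors ±1 and ±j. This is one plausible reading of why the abstract pairs radix-16 computational complexity with radix-4 butterfly hardware. A unit impulse at index 0 passed to `fft256` should return 256 values equal to 1.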