High throughput DA-based DCT with high accuracy error-compensated adder tree

Authors:
Yuan-Ho Chen;Tsin-Yuan Chang;Chung-Yi Li
Affiliations:
Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan (all authors)
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2011

Abstract

This brief proposes an error-compensated adder tree (ECAT) that performs shifting and addition in parallel to deal with truncation errors, yielding a low-error, high-throughput discrete cosine transform (DCT) design. Instead of the 12 bits used in previous works, a 9-bit distributed arithmetic (DA) precision is chosen to meet the peak signal-to-noise ratio (PSNR) requirements. The resulting area-efficient DCT core achieves a throughput of 1 Gpels/s with a gate count of 22.2 K under the PSNR requirements outlined in previous works.
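
The truncation error mentioned in the abstract can be illustrated with a small numerical sketch. The Python below is not the ECAT circuit from the brief. It is a generic illustration, under stated assumptions, of why truncating shifted words before adding them produces a one-sided error and how a constant bias can reduce it. All function names, the 8-word input, the uniform-input assumption, and the constant-bias scheme are hypothetical choices made for this example; only the 9-bit word width is taken from the abstract.

```python
import random


def shift_add_exact(words):
    """Exact value of sum(words[k] / 2**k), computed in floating point."""
    return sum(w / (1 << k) for k, w in enumerate(words))


def shift_add_truncated(words):
    """Shift word k right by k bits (dropping its low bits), then add."""
    # Python's >> on ints is an arithmetic shift, i.e. floor(w / 2**k).
    return sum(w >> k for k, w in enumerate(words))


def expected_truncation_bias(num_words):
    """Mean amount lost by the truncation, assuming uniform low bits.

    Dropping k low bits loses (1 - 2**-k) / 2 on average.
    """
    return sum((1 - 2.0 ** -k) / 2 for k in range(num_words))


def shift_add_compensated(words):
    """Truncated shift-add plus a constant bias (hypothetical scheme)."""
    bias = round(expected_truncation_bias(len(words)))
    return shift_add_truncated(words) + bias


def mean_abs_error(fn, trials=2000, num_words=8, width=9, seed=0):
    """Average |fn(words) - exact| over random signed `width`-bit words."""
    rng = random.Random(seed)
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    total = 0.0
    for _ in range(trials):
        words = [rng.randint(lo, hi) for _ in range(num_words)]
        total += abs(fn(words) - shift_add_exact(words))
    return total / trials


if __name__ == "__main__":
    print("truncated  :", mean_abs_error(shift_add_truncated))
    print("compensated:", mean_abs_error(shift_add_compensated))
```

For the words [5, 5, 5], the exact value is 8.75, plain truncation gives 8, and the compensated sum gives 9. Truncation always rounds toward negative infinity, so its error accumulates in one direction as more words are added, while the constant bias recentres the error around zero.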