This paper proposes an efficient core design for the two-dimensional (2-D) discrete cosine transform and its inverse (DCT/IDCT). The 2-D DCT/IDCT is computed with the row-column decomposition technique. We formulate the one-dimensional (1-D) DCT/IDCT as a cyclic convolution by properly arranging the input sequence, optimize the multiplications through common subexpression sharing, and carry out the multiplications with carry-save adders (CSAs). The cyclic convolution form exploits word-level data sharing across the different DCT/IDCT outputs, while common subexpression sharing exploits bit-level data sharing in computing those outputs. Compared with some existing DCT/IDCT realizations, the proposed approach reduces the delay-area product (gate count × time unit) by 20% to 33% on average, based on a 0.35-μm CMOS technology with data word lengths ranging from 16 to 24 bits. We also propose an IP generator for designing the 2-D DCT/IDCT based on this approach. It provides a design-automation environment with configurable parameters for designing a 2-D DCT/IDCT core that is suitable for most image and video compression applications.
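The row-column decomposition mentioned above can be illustrated with a minimal software sketch. It is not the proposed hardware architecture: it uses floating-point arithmetic rather than fixed-point CSA datapaths, it assumes the orthonormal DCT-II scaling, and the function names (`dct_1d`, `dct_2d_row_column`, `dct_2d_direct`) are hypothetical names chosen for illustration. The sketch only shows that applying a 1-D DCT to every row and then to every column gives the same result as the direct 2-D definition.

```python
import math


def dct_1d(x):
    """Orthonormal 1-D DCT-II of a sequence (scaling is an assumption)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def dct_2d_row_column(block):
    """2-D DCT via row-column decomposition: 1-D DCT on rows, then columns."""
    row_pass = [dct_1d(row) for row in block]
    col_pass = [dct_1d(col) for col in transpose(row_pass)]
    return transpose(col_pass)


def dct_2d_direct(block):
    """2-D DCT evaluated directly from the double-sum definition (reference)."""
    n = len(block)

    def c(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for i in range(n):
                for j in range(n):
                    acc += (
                        block[i][j]
                        * math.cos(math.pi * (2 * i + 1) * u / (2 * n))
                        * math.cos(math.pi * (2 * j + 1) * v / (2 * n))
                    )
            out[u][v] = c(u) * c(v) * acc
    return out


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


if __name__ == "__main__":
    # Arbitrary 8x8 test block; the values carry no special meaning.
    block = [[float((i * 8 + j) % 17) for j in range(8)] for i in range(8)]
    err = max_abs_diff(dct_2d_row_column(block), dct_2d_direct(block))
    print("max difference between row-column and direct 2-D DCT:", err)
```

For an N×N block the row-column form needs 2N 1-D transforms of length N, whereas the direct double sum costs on the order of N^4 multiply-accumulate operations, which is why a single optimized 1-D unit can serve as the building block of a 2-D core.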