This paper presents a memory-efficient approach to realizing cyclic convolution and its application to the discrete cosine transform (DCT). We adopt distributed arithmetic (DA) computation, exploit the symmetry of the DCT coefficients to merge elements of the DCT kernel matrix, separate the kernel into two perfect cyclic forms, and partition the ROM content into groups. This enables an efficient realization of a one-dimensional (1-D) N-point DCT kernel using (N-1)/2 adders or subtractors, one small ROM module, a barrel shifter, and ((N-1)/2)+1 accumulators. The proposed technique rearranges the ROM content of the conventional DA approach into several groups such that all elements in a group can be accessed simultaneously while accumulating all the DCT outputs, which increases ROM utilization. For an example using 16-bit coefficients, the proposed design saves more than 57% of the delay-area product compared with existing DA-based designs in the case of the 1-D seven-point DCT. Finally, a 1-D DCT chip was implemented to illustrate the efficiency of the proposed approach.
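To make the DA idea concrete, the following is a minimal software sketch of generic bit-serial distributed arithmetic applied to a cyclic convolution. It is not the ROM grouping scheme or the hardware architecture of the paper. It only illustrates why a cyclic form suits DA: one ROM of partial coefficient sums can serve every output if the ROM address is rotated. All function names (`build_rom`, `rotate_addr`, `bit_plane_addresses`, `da_accumulate`, `cyclic_convolution_da`) are hypothetical, and integer coefficients and two's-complement inputs are assumed.

```python
def build_rom(coeffs):
    """ROM entry at address a holds the sum of coeffs[k] over the set bits k of a."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (a >> k) & 1)
            for a in range(1 << n)]


def rotate_addr(addr, shift, n):
    """Rotate an n-bit address so that bit k moves to bit (k - shift) mod n."""
    shift %= n
    mask = (1 << n) - 1
    return ((addr >> shift) | (addr << (n - shift))) & mask


def bit_plane_addresses(xs, bits):
    """For each bit position b, pack bit b of every input into one ROM address."""
    mask = (1 << bits) - 1  # view each input as a `bits`-wide two's-complement word
    return [sum((((x & mask) >> b) & 1) << k for k, x in enumerate(xs))
            for b in range(bits)]


def da_accumulate(rom, addrs):
    """Shift-accumulate ROM words; the sign-bit plane is subtracted."""
    acc = 0
    last = len(addrs) - 1
    for b, addr in enumerate(addrs):
        term = rom[addr] << b
        acc += -term if b == last else term
    return acc


def cyclic_convolution_da(h, xs, bits):
    """y[m] = sum_k h[(m - k) mod n] * xs[k], using a single shared ROM."""
    n = len(h)
    # Assumption of this sketch: the ROM is built for output 0 only;
    # output m reuses it by rotating each address by m positions.
    rom = build_rom([h[(-k) % n] for k in range(n)])
    addrs = bit_plane_addresses(xs, bits)
    return [da_accumulate(rom, [rotate_addr(a, m, n) for a in addrs])
            for m in range(n)]
```

For example, `cyclic_convolution_da([1, 2, 3], [4, -5, 6], bits=8)` matches the direct sum of products for each output. In this sketch the ROM has 2^n words for an n-point convolution; the partitioning described in the abstract is aimed at reducing and better utilizing that storage, which the sketch does not attempt.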