In this paper, a high-performance array processor for signal processing algorithms with high computational complexity is implemented using a 0.16 µm CMOS standard cell library. The proposed array processor consists of simple processing elements. The highly regular, parallel, and pipelined processing elements simplify the design of complex signal processing systems and enable a high throughput rate through massively parallel computation. We show the utility of the proposed architecture as a configurable core by mapping inverse discrete cosine transform (IDCT), motion compensation (MC), and inverse quantization (IQ) onto the proposed fabric. In addition, we propose a novel scheme that integrates the inverse quantization stage of video decoding into the 2-D IDCT process, simplifying the computational logic. The results show that a throughput rate high enough to meet the real-time requirement is achieved by exploiting the properties of both compressed video data statistics and the array processor architecture.
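The abstract does not give the details of the integrated IQ/IDCT scheme, so the following is only an illustrative sketch of one way the two steps can be merged, not the authors' method. It assumes a plain uniform dequantization (level times step size, with no mismatch control or saturation) and folds the step sizes into precomputed 8×8 IDCT basis images, so that zero-valued levels, which dominate typical compressed blocks, are skipped. All function names (`dct_basis`, `idct_separate`, `build_scaled_basis`, `idct_fused`) are hypothetical.

```python
import math

N = 8  # block size used by the 8x8 2-D IDCT


def dct_basis(u, v):
    """Return the 8x8 orthonormal IDCT basis image for frequency (u, v)."""
    cu = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
    cv = math.sqrt(1.0 / N) if v == 0 else math.sqrt(2.0 / N)
    return [[cu * cv
             * math.cos((2 * x + 1) * u * math.pi / (2 * N))
             * math.cos((2 * y + 1) * v * math.pi / (2 * N))
             for y in range(N)] for x in range(N)]


def idct_separate(levels, qstep):
    """Reference path: dequantize every level, then run a direct 2-D IDCT."""
    coeffs = [[levels[u][v] * qstep[u][v] for v in range(N)] for u in range(N)]
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            basis = dct_basis(u, v)
            for x in range(N):
                for y in range(N):
                    out[x][y] += coeffs[u][v] * basis[x][y]
    return out


def build_scaled_basis(qstep):
    """Fold the quantizer step sizes into the basis images (done once per table)."""
    scaled = {}
    for u in range(N):
        for v in range(N):
            basis = dct_basis(u, v)
            scaled[(u, v)] = [[qstep[u][v] * basis[x][y] for y in range(N)]
                              for x in range(N)]
    return scaled


def idct_fused(levels, scaled_basis):
    """Fused path: accumulate level * scaled basis, skipping zero levels."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            level = levels[u][v]
            if level == 0:
                continue  # no separate dequantization and no work for zeros
            basis = scaled_basis[(u, v)]
            for x in range(N):
                for y in range(N):
                    out[x][y] += level * basis[x][y]
    return out
```

In this sketch the per-block cost scales with the number of nonzero levels rather than with the full 64 coefficients, which is one way the statistics of compressed video data could be exploited. A real decoder would additionally need the standard-specific dequantization rules that are omitted here.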