High Performance Array Processor for Video Decoding

Authors:
J. Lee; N. Vijaykrishnan; M. J. Irwin
Affiliations:
Pennsylvania State University; Pennsylvania State University; Pennsylvania State University
Venue:
ISVLSI '05 Proceedings of the IEEE Computer Society Annual Symposium on VLSI: New Frontiers in VLSI Design
Year:
2005

Abstract

In this paper, a high-performance array processor for computationally intensive signal processing algorithms is implemented using a 0.16 µm CMOS standard cell library. The proposed array processor consists of simple processing elements. Because these elements are highly regular, parallel, and pipelined, they simplify the design of complex signal processing systems and enable a high throughput rate through massively parallel computation. We show the utility of the proposed architecture as a configurable core by mapping the inverse discrete cosine transform (IDCT), motion compensation (MC), and inverse quantization (IQ) onto the proposed fabric. In addition, we propose a novel scheme that integrates the inverse quantization part of video decoding into the 2-D IDCT process, simplifying the computational logic. The results show that a throughput rate high enough to meet the real-time requirement is achieved by exploiting the properties of both the compressed video data statistics and the array processor architecture.
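
The abstract does not give the details of the integrated IQ/IDCT scheme, so the following is only a minimal software sketch of the general idea, not the authors' hardware design. It assumes an 8×8 block, a plain multiplicative inverse quantizer (no saturation, rounding, or mismatch control), and hypothetical names (`dequantize`, `idct_2d`, `idct_2d_with_iq`). The merged version folds the quantizer step into the weight of each basis function and skips zero-valued levels, which is one way the sparsity of compressed video data could be exploited.

```python
import math

N = 8  # block size assumed for this sketch

# COS[x][u] = cos((2x + 1) * u * pi / 16), the 1-D DCT basis samples
COS = [[math.cos((2 * x + 1) * u * math.pi / (2 * N)) for u in range(N)]
       for x in range(N)]


def _c(k):
    """DCT normalization factor: 1/sqrt(2) for the DC index, 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def dequantize(levels, qmatrix):
    """Assumed simple inverse quantization: coefficient = level * step."""
    return [[levels[u][v] * qmatrix[u][v] for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Reference 2-D IDCT, computed directly from the definition."""
    out = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            acc = 0.0
            for u in range(N):
                for v in range(N):
                    acc += _c(u) * _c(v) * coeffs[u][v] * COS[i][u] * COS[j][v]
            out[i][j] = acc / 4.0
    return out


def idct_2d_with_iq(levels, qmatrix):
    """IDCT with inverse quantization folded in (illustrative only).

    Each nonzero level contributes one scaled basis image; zero levels,
    which dominate typical compressed blocks, cost nothing.
    """
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            level = levels[u][v]
            if level == 0:
                continue  # skip work for zero coefficients
            # Quantizer step and DCT normalization merged into one weight.
            weight = level * qmatrix[u][v] * _c(u) * _c(v) / 4.0
            for i in range(N):
                row_term = weight * COS[i][u]
                for j in range(N):
                    out[i][j] += row_term * COS[j][v]
    return out
```

Under these assumptions the merged routine produces the same pixel values as `idct_2d(dequantize(levels, qmatrix))`, but its cost scales with the number of nonzero levels rather than with the full 64 coefficients.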