Multi-Frame Motion-Compensated Prediction for Video Transmission
Variable block size motion estimation (VBSME) is one of several contributors to H.264/AVC's excellent coding efficiency. However, its high computational complexity and huge memory traffic make hardware design difficult. In this paper, we propose a memory-efficient and highly parallel VLSI architecture for full-search VBSME (FSVBSME). The architecture consists of 16 two-dimensional arrays, each comprising 16 × 16 processing elements (PEs). Four arrays form a group that matches four reference blocks against one current block in parallel, and four such groups perform block matching for four current blocks in a pipelined fashion. By exploiting the overlap among the multiple reference blocks of a current block, and between the search windows of adjacent current blocks, we propose a novel data reuse scheme that reduces memory access. Compared with the popular Level C data reuse scheme, our approach saves 98% of on-chip memory accesses with only 25% of local memory overhead. Synthesized with a TSMC 180-nm CMOS cell library, the design can process 1920 × 1088 video at 30 frames/s when running at 130 MHz. The architecture is scalable to wider search ranges, multiple reference frames, pixel truncation, and downsampling. We also suggest a criterion, called design efficiency, for comparing different works; by this measure, the proposed design is 72% more efficient than the best design to date.
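To make the full-search VBSME computation concrete, the following is a minimal software sketch of the matching that the PE arrays perform in hardware. It computes the 4×4 sub-block SADs of a 16×16 current block once per candidate displacement and sums them into the 16×16 cost, which is the standard SAD-merging idea behind variable block sizes in H.264/AVC (larger partitions reuse the same 4×4 sub-sums). The padded-window layout, the pixel pattern, and all function names here are illustrative assumptions, not details from the paper.

```python
def sad_4x4(cur, win, pad, bx, by, dx, dy):
    """SAD of the 4x4 sub-block (bx, by) of `cur` against candidate (dx, dy).

    `win` is a search window padded by `pad` pixels on each side, so the
    candidate displacement (dx, dy) in [-pad, pad] never indexes out of range.
    """
    total = 0
    for y in range(4):
        for x in range(4):
            cy, cx = by * 4 + y, bx * 4 + x
            total += abs(cur[cy][cx] - win[pad + cy + dy][pad + cx + dx])
    return total

def full_search_16x16(cur, win, pad):
    """Exhaustively search all candidates; return (best_dx, best_dy, best_sad).

    For each candidate, the sixteen 4x4 SADs are computed once and summed.
    In a real VBSME engine the same sixteen sub-sums would also be merged
    into the 4x8, 8x4, 8x8, 8x16, 16x8 partition costs without re-reading
    any pixels; only the 16x16 merge is shown here for brevity.
    """
    best = (0, 0, float("inf"))
    for dy in range(-pad, pad + 1):
        for dx in range(-pad, pad + 1):
            cost = sum(sad_4x4(cur, win, pad, bx, by, dx, dy)
                       for by in range(4) for bx in range(4))
            if cost < best[2]:
                best = (dx, dy, cost)
    return best
```

A full-HD encoder would run this inner loop for every macroblock and every reference frame, which is why the paper's parallel PE groups and search-window reuse between adjacent current blocks matter so much in practice.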