A memory-efficient and highly parallel architecture for variable block size integer motion estimation in H.264/AVC

  • Authors:
  • Chao-Yang Kao;Youn-Long Lin

  • Affiliations:
  • Department of Computer Science, National Tsing Hua University, HsinChu, Taiwan;Department of Computer Science, National Tsing Hua University, HsinChu, Taiwan

  • Venue:
  • IEEE Transactions on Very Large Scale Integration (VLSI) Systems
  • Year:
  • 2010

Abstract

Variable block size motion estimation (VBSME) is one of several contributors to H.264/AVC's excellent coding efficiency. However, its high computational complexity and huge memory traffic make design difficult. In this paper, we propose a memory-efficient and highly parallel VLSI architecture for full-search VBSME (FSVBSME). Our architecture consists of 16 2-D arrays, each consisting of 16 × 16 processing elements (PEs). Four arrays form a group to match four reference blocks against one current block in parallel. Four groups perform block matching for four current blocks in a pipelined fashion. Taking advantage of the overlap among multiple reference blocks of a current block and between the search windows of adjacent current blocks, we propose a novel data reuse scheme to reduce memory access. Compared with the popular Level C data reuse scheme, our approach can save 98% of on-chip memory access with only 25% local memory overhead. Synthesized with a TSMC 180-nm CMOS cell library, our design is capable of processing 1920 × 1088 video at 30 fps when running at 130 MHz. The architecture is scalable to wider search ranges, multiple reference frames, pixel truncation, and downsampling. We suggest a criterion called design efficiency for comparing different works. It shows that the proposed design is 72% more efficient than the best design to date.
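
To make the full-search VBSME computation concrete, the following is a minimal software sketch, not the authors' hardware design. It assumes the common approach of computing the sum of absolute differences (SAD) for each 4 × 4 sub-block of a 16 × 16 macroblock and merging those into SADs for the 41 H.264 partitions (sixteen 4 × 4, eight 8 × 4, eight 4 × 8, four 8 × 8, two 16 × 8, two 8 × 16, one 16 × 16). The function names and the dictionary key layout are hypothetical, chosen only for illustration; the abstract does not describe the internal datapath of the PE arrays.

```python
# Illustrative sketch only: a software model of full-search VBSME,
# not the hardware architecture proposed in the paper.

# H.264 partition shapes (width, height) in units of 4x4 sub-blocks.
PARTITIONS = [(1, 1), (2, 1), (1, 2), (2, 2), (4, 2), (2, 4), (4, 4)]


def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid holding the SAD of each 4x4 sub-block of a 16x16 block."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid


def vbs_sads(grid):
    """Merge 4x4 SADs into SADs for all 41 variable-size blocks.

    Keys are (width, height, x_offset, y_offset) in pixels (assumed layout).
    """
    out = {}
    for w, h in PARTITIONS:
        for by in range(0, 4, h):
            for bx in range(0, 4, w):
                out[(4 * w, 4 * h, 4 * bx, 4 * by)] = sum(
                    grid[y][x]
                    for y in range(by, by + h)
                    for x in range(bx, bx + w)
                )
    return out


def full_search_vbsme(cur, window):
    """Exhaustively test every 16x16 candidate in the search window.

    Returns, for each of the 41 blocks, the pair (best SAD, (dx, dy)),
    where (dx, dy) is the candidate's top-left corner inside the window.
    """
    best = {}
    for dy in range(len(window) - 15):
        for dx in range(len(window[0]) - 15):
            ref = [row[dx:dx + 16] for row in window[dy:dy + 16]]
            for key, sad in vbs_sads(sad_4x4_grid(cur, ref)).items():
                if key not in best or sad < best[key][0]:
                    best[key] = (sad, (dx, dy))
    return best
```

Adjacent candidates in this loop share all but one row or column of reference pixels, which is the overlap the abstract's data reuse scheme exploits; the sketch simply re-reads those pixels every time. For scale, assuming 16 × 16 macroblocks, a 1920 × 1088 frame contains 120 × 68 = 8160 macroblocks, so 30 fps requires 244,800 macroblocks per second, which at 130 MHz leaves roughly 531 clock cycles per macroblock.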