A reconfigurable arithmetic array for multimedia applications
FPGA '99 Proceedings of the 1999 ACM/SIGDA seventh international symposium on Field programmable gate arrays
Fast MPEG-4 Motion Estimation: Processor Based and Flexible VLSI Implementations
Journal of VLSI Signal Processing Systems - Special issue on implementation of MPEG-4 multimedia codecs
Performance of H.26L Video Encoder on General-Purpose Processor
Journal of VLSI Signal Processing Systems
A fast VLSI architecture for full-search variable block size motion estimation in MPEG-4 AVC/H.264
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Serial and Parallel FPGA-based Variable Block Size Motion Estimation Processors
Journal of Signal Processing Systems
An efficient VLSI architecture for H.264 variable block size motion estimation
IEEE Transactions on Consumer Electronics
IEEE Transactions on Circuits and Systems for Video Technology
Quadtree-structured variable-size block-matching motion estimation with minimal error
IEEE Transactions on Circuits and Systems for Video Technology
A novel low-power full-search block-matching motion-estimation design for H.263+
IEEE Transactions on Circuits and Systems for Video Technology
Overview of the H.264/AVC video coding standard
IEEE Transactions on Circuits and Systems for Video Technology
Analysis and architecture design of an HDTV720p 30 frames/s H.264/AVC encoder
IEEE Transactions on Circuits and Systems for Video Technology
Fast Algorithm and Architecture Design of Low-Power Integer Motion Estimation for H.264/AVC
IEEE Transactions on Circuits and Systems for Video Technology
A novel modular systolic array architecture for full-search block matching motion estimation
IEEE Transactions on Circuits and Systems for Video Technology
VLSI architecture for a flexible block matching processor
IEEE Transactions on Circuits and Systems for Video Technology
An Adaptive Motion Estimation Architecture for H.264/AVC
Journal of Signal Processing Systems
Hi-index | 0.00 |
In H.264/AVC, the motion estimation (ME) routine supports variable block size and involves highly parallel sum of absolute difference (SAD) computations. In this study, we introduce a bit serial hybrid-grained processing element (PE) based 2D architecture that has both early termination and intensive data reuse capabilities. PEs operate on most significant bit-first arithmetic for early termination and the 2D architecture enables on-chip data reuse between neighboring PEs in a bit-by-bit pipelined fashion. Hybrid-grained PEs reduce the hardware overhead of conventional adder tree structures used for implementing the variable block size ME. Our design reduces the gate count by 7x compared to its ASIC counterpart, operates at a comparable frequency while sustaining 30 fps and 60 fps; and outperforms bit parallel and bit serial architectures in terms of throughput and performance per gate for various video formats.