Hardware architecture design of an H.264/AVC video codec
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Architecture design for deblocking filter in H.264/JVT/AVC
ICME '03 Proceedings of the 2003 International Conference on Multimedia and Expo - Volume 2
Low-power H.264/AVC baseline decoder for portable applications
ISLPED '07 Proceedings of the 2007 international symposium on Low power electronics and design
An efficient architecture for adaptive deblocking filter of H.264/AVC video coding
IEEE Transactions on Consumer Electronics
A platform based bus-interleaved architecture for de-blocking filter in H.264/MPEG-4 AVC
IEEE Transactions on Consumer Electronics
Overview of the H.264/AVC video coding standard
IEEE Transactions on Circuits and Systems for Video Technology
Low-complexity transform and quantization in H.264/AVC
IEEE Transactions on Circuits and Systems for Video Technology
Complexity of optimized H.26L video decoder implementation
IEEE Transactions on Circuits and Systems for Video Technology
A Five-Stage Pipeline, 204 Cycles/MB, Single-Port SRAM-Based Deblocking Filter for H.264/AVC
IEEE Transactions on Circuits and Systems for Video Technology
A Bit-Rate Aware Scalable H.264/AVC Deblocking Filter Using Dynamic Partial Reconfiguration
Journal of Signal Processing Systems
DyPS: dynamic processor switching for energy-aware video decoding on multi-core SoCs
ACM SIGBED Review - Special Issue on the 3rd Embedded Operating System Workshop (EWiLi 2013)
Hi-index | 0.00 |
This paper presents methods for efficient optimization of ASIC implementation for H.264/AVC video decoding. A systematic approach in optimization is presented in a top-down flow. Tradeoffs among Power, Throughput, and Area (PTA) at both system level and block level are studied and balanced. The system architecture is first evaluated. We then focus on the pipeline organization, parallelism, and memory architecture optimization. Different pipeline granularities are compared and their pros-and-cons are evaluated. Various parallel scenarios, especially 1驴脳驴4-column and 4驴脳驴1-row, are analyzed and compared. Then the detailed designs of various building blocks, such as inverse transform, inter prediction, and deblocking filter, are evaluated and their intrinsic characteristics are exploited to facilitate PTA optimization. Finally, we provide the design guidelines for ASIC implementation based on the analysis and our design experiences of five dedicated decoder chips.