Mapping control-intensive video kernels onto a coarse-grain reconfigurable architecture: the H.264/AVC deblocking filter

Authors:
C. Arbelo;A. Kanstein;S. López;J. F. López;M. Berekovic;R. Sarmiento;J.-Y. Mignolet
Affiliations:
University of Las Palmas de Gran Canaria, Spain;Freescale Inc., Toulouse, France;University of Las Palmas de Gran Canaria, Spain;University of Las Palmas de Gran Canaria, Spain;IMEC, Leuven, Belgium;University of Las Palmas de Gran Canaria, Spain;IMEC, Leuven, Belgium
Venue:
Proceedings of the conference on Design, automation and test in Europe
Year:
2007

Abstract

Deblocking filtering is one of the most compute-intensive tasks in an H.264/AVC video decoder because of its demanding memory accesses and irregular data flow. An efficient implementation is therefore challenging, especially on programmable platforms. This paper presents the mapping of this filter onto ADRES (Architecture for Dynamically Reconfigurable Embedded Systems), a C-programmable coarse-grained reconfigurable architecture, and reports results from the evaluation of different topologies. Compared with an implementation on a dedicated Very Long Instruction Word (VLIW) processor, the ADRES mapping considerably reduces the number of cycles and memory accesses needed to perform the filtering and increases the degree of instruction-level parallelism (ILP). This demonstrates that high ILP is achievable on ADRES even for irregular, data-dependent kernels.
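
To make the "irregular, data-dependent" character of the kernel concrete, the following is a minimal C sketch of the per-line decision in the H.264/AVC normal luma edge filter (boundary strength 1 to 3). It is not the authors' code. The function name `filter_edge_line` is hypothetical, and `alpha`, `beta`, and `tc0` are passed in directly as an assumption, whereas a real decoder derives them from quantization-dependent tables. The strong filter (boundary strength 4) and the updates to the second sample on each side are omitted for brevity.

```c
#include <stdlib.h>

/* Clamp v to the range [lo, hi]. */
static int clip3(int lo, int hi, int v)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Simplified sketch: filter one line of luma samples across a block edge.
 * p[0..2] are the samples on one side of the edge (p[0] nearest to it),
 * q[0..2] the samples on the other side. Only p[0] and q[0] are updated.
 * alpha, beta and tc0 are assumed to be supplied by the caller.
 * Returns 1 if the samples were modified, 0 if the edge was left untouched.
 */
static int filter_edge_line(int p[3], int q[3], int bs,
                            int alpha, int beta, int tc0)
{
    /* Early exit 1: boundary strength 0 means "do not filter". */
    if (bs == 0)
        return 0;

    /* Early exit 2: a large step across the edge is treated as real
       image content rather than a blocking artifact. */
    if (abs(p[0] - q[0]) >= alpha ||
        abs(p[1] - p[0]) >= beta  ||
        abs(q[1] - q[0]) >= beta)
        return 0;

    /* Data-dependent widening of the clipping range. */
    int ap = abs(p[2] - p[0]) < beta;
    int aq = abs(q[2] - q[0]) < beta;
    int tc = tc0 + ap + aq;

    /* Assumes arithmetic right shift for negative values, as on
       common compilers. */
    int delta = clip3(-tc, tc,
                      (((q[0] - p[0]) * 4) + (p[1] - q[1]) + 4) >> 3);

    p[0] = clip3(0, 255, p[0] + delta);
    q[0] = clip3(0, 255, q[0] - delta);
    return 1;
}
```

Every line of samples takes a different path through these conditions depending on its values, which is the kind of control flow that makes the kernel harder to parallelize than a fixed-coefficient filter.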