Designing real-time H.264 decoders with dataflow architectures

Authors:
Youngsoo Kim; Suleyman Sair
Affiliations:
NC State University; NC State University
Venue:
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Year:
2005

Abstract

High-performance microprocessors are designed with general-purpose applications in mind. In embedded System-on-Chip (SoC) designs, these architectures typically handle control-intensive tasks, but they are significantly inefficient for data-intensive tasks such as video encoding and decoding. Although configurable processors fill this gap by complementing the existing functional units with instruction extensions, their performance lags behind the needs of real-time embedded tasks. In this paper, we evaluate the performance potential of a dataflow processor for H.264 video decoding. We first profile the H.264 application to capture the amount of data traffic among modules. We use this information to guide the placement of H.264 modules in the WaveScalar dataflow architecture. A simulated-annealing-based placement algorithm produces the final placement, aiming to optimize the communication costs between the modules in the dataflow architecture. In addition to outperforming contemporary embedded and customized processors, our simulated-annealing-guided design shows a 13% speedup in execution time over the original WaveScalar architecture. With our dataflow design methodology, emerging embedded applications requiring several GOPS to meet real-time constraints can be drafted within a reasonable amount of design time.
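
The abstract describes the placement step only at a high level, so the following is a minimal Python sketch of the general technique, not the authors' implementation. It assumes a rectangular grid of processing elements, a cost equal to profiled traffic multiplied by Manhattan distance, and a geometric cooling schedule; none of these details are given in the abstract. The module names and traffic volumes are invented for illustration and are not measurements from the paper.

```python
import math
import random


def placement_cost(placement, traffic):
    """Sum of traffic volume times Manhattan distance over all module pairs."""
    cost = 0
    for (a, b), volume in traffic.items():
        (ra, ca), (rb, cb) = placement[a], placement[b]
        cost += volume * (abs(ra - rb) + abs(ca - cb))
    return cost


def anneal_placement(modules, traffic, rows, cols, steps=5000,
                     t_start=10.0, t_end=0.01, seed=0):
    """Place modules on a rows x cols grid with simulated annealing.

    Assumed cost model: traffic * Manhattan distance (an illustration only).
    Requires a grid with at least two cells.
    """
    if len(modules) > rows * cols:
        raise ValueError("grid is too small for the given modules")
    rng = random.Random(seed)
    # One entry per grid cell; None marks an empty cell.
    cells = list(modules) + [None] * (rows * cols - len(modules))
    rng.shuffle(cells)

    def to_placement(state):
        # Cell index i maps to (row, col) = divmod(i, cols).
        return {m: divmod(i, cols) for i, m in enumerate(state) if m is not None}

    current = placement_cost(to_placement(cells), traffic)
    best_cells, best = cells[:], current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i, j = rng.sample(range(len(cells)), 2)
        cells[i], cells[j] = cells[j], cells[i]  # propose a swap
        candidate = placement_cost(to_placement(cells), traffic)
        delta = candidate - current
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
            if current < best:
                best_cells, best = cells[:], current
        else:
            cells[i], cells[j] = cells[j], cells[i]  # reject: undo the swap
    return to_placement(best_cells), best


# Hypothetical decoder stages and made-up traffic volumes (not from the paper).
modules = ["entropy_decode", "inverse_transform", "intra_pred",
           "motion_comp", "deblock"]
traffic = {
    ("entropy_decode", "inverse_transform"): 40,
    ("inverse_transform", "intra_pred"): 25,
    ("inverse_transform", "motion_comp"): 30,
    ("intra_pred", "deblock"): 20,
    ("motion_comp", "deblock"): 35,
}
placement, cost = anneal_placement(modules, traffic, rows=2, cols=3)
print(placement, cost)
```

Recomputing the full cost on every proposed swap keeps the sketch short; a practical placer would more likely update the cost incrementally and use a distance model that reflects the target interconnect rather than plain Manhattan distance.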