Datapath design for a VLIW Video Signal Processor

Authors:
A. Wolfe;J. Fritts;S. Dutta;E. S. T. Fernandes
Affiliations:
-;-;-;-
Venue:
HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture
Year:
1997

Citing 0
Cited 9

Available parallelism in video applications

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Trace-driven studies of VLIW video signal processors

Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Data-path synthesis of VLIW video signal processors

Proceedings of the 11th international symposium on System synthesis
Dynamic Parallel media processing using Speculative Broadcast Loop (SBL)

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
A Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors

PATMOS '02 Proceedings of the 12th International Workshop on Integrated Circuit Design. Power and Timing Modeling, Optimization and Simulation
Efficient orchestration of sub-word parallelism in media processors

Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
Tile size selection for low-power tile-based architectures

Proceedings of the 3rd conference on Computing frontiers
Synchroscalar: Evaluation of an embedded, multi-core architecture for media applications

Journal of Embedded Computing - Issues in embedded single-chip multicore architectures
Using Application Bisection Bandwidth to Guide Tile Size Selection for the Synchroscalar Tile-Based Architecture

Transactions on High-Performance Embedded Architectures and Compilers I

Abstract

This paper represents a design study of the datapath for a very long instruction word (VLIW) video signal processor (VSP). VLIW architectures provide high parallelism and excellent high-level language programmability, but require careful attention to VLSI and compiler design. Flexible, high-bandwidth interconnect, high-connectivity register files, and fast cycle times are required to achieve real-time video signal processing. Parameterizable versions of key modules have been designed in a 0.25 μm process, allowing us to explore tradeoffs in the VLIW VSP design space. The designs target 33 operations per cycle at clock rates exceeding 600 MHz. Various VLIW code scheduling techniques have been applied to 6 VSP kernels and evaluated on 7 different candidate datapath designs. The results of these simulations are used to indicate which architectural tradeoffs enhance overall performance in this application domain.