Automated dynamic throughput-constrained structural-level pipelining in streaming applications

Authors:
Mark Muir;Tughrul Arslan;Iain Lindsay
Affiliations:
The University of Edinburgh, Edinburgh, United Kingdom;The University of Edinburgh, Edinburgh, United Kingdom and Institute for System Level Integration, Alba Centre, Livingston, United Kingdom;The University of Edinburgh, Edinburgh, United Kingdom
Venue:
Proceedings of the conference on Design, automation and test in Europe
Year:
2008

Abstract

Stream processing applications such as image signal processing demand high throughput. However, customers increasingly demand runtime flexibility in their designs, which custom ASIC solutions cannot provide. Currently, reconfigurable processors tend to offer insufficient throughput for widespread use in streaming applications. This paper demonstrates how structural-level pipelining techniques can be applied to rapidly dynamically reconfigurable computing architectures to increase throughput. This is done by automatically inserting registers into the data path of performance-critical code sections that have already been optimised into a single configuration context. A new algorithm is presented that chooses the insertion points of pipeline stage registers so as to meet a specified throughput whilst minimising register resource usage. The paper then demonstrates a new approach in which properties of dynamic reconfiguration are used to perform pipeline stage initialisation and flushing. The technique is demonstrated on a real-life application, the demosaic filter in a standard image signal processing pipe used in modern digital cameras, where it boosts throughput from 16 MPixels/s to 51 MPixels/s on an example reconfigurable processor.
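
The abstract does not give the details of the register insertion algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a much simpler model than a real data path: a linear chain of operations with known combinational delays, one sample processed per clock cycle, and a register cost that is the same at every insertion point. The function name `insert_pipeline_registers`, its parameters, and the delay values are all hypothetical. Under these assumptions, a greedy pass that fills each stage up to the clock period implied by the target throughput uses the fewest registers.

```python
from typing import List


def insert_pipeline_registers(delays_ns: List[float],
                              target_msamples_per_s: float) -> List[int]:
    """Return indices i such that a register is placed after operation i.

    Assumptions (illustrative only, not taken from the paper):
    - operations form a linear chain, listed in data-flow order;
    - one sample is produced per clock cycle, so the target throughput
      fixes the maximum combinational delay allowed in any stage;
    - every insertion point costs the same number of registers.
    """
    # Clock period (ns) needed to sustain the target throughput.
    period_ns = 1000.0 / target_msamples_per_s

    cuts: List[int] = []
    stage_delay = 0.0
    for i, delay in enumerate(delays_ns):
        if delay > period_ns:
            # A single operation is already slower than one clock period,
            # so no register placement can meet the target.
            raise ValueError(f"operation {i} exceeds the clock period")
        if stage_delay + delay > period_ns:
            # Adding this operation would break the timing budget:
            # close the current stage with a register before it.
            cuts.append(i - 1)
            stage_delay = 0.0
        stage_delay += delay
    return cuts


if __name__ == "__main__":
    # Hypothetical operation delays in nanoseconds.
    example_delays = [4.0, 3.0, 5.0, 2.0, 6.0]
    # 100 Msamples/s implies a 10 ns stage budget.
    print(insert_pipeline_registers(example_delays, 100.0))
```

With the hypothetical delays above and a 10 ns budget, the sketch places registers after operations 1 and 3, giving three stages with delays of 7, 7 and 6 ns. A real data path is a graph rather than a chain, and the number of registers needed at a cut depends on how many values cross it, which is why choosing insertion points to minimise register usage is a harder problem than this example suggests.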