Automatic Task Scheduling / Loop Unrolling using Dedicated RTR Controllers in Coarse Grain Reconfigurable Architectures

Authors:
Pascal Benoit;Lionel Torres;Gilles Sassatelli;Michel Robert;Gaston Cambon
Affiliations:
LIRMM;LIRMM;LIRMM;LIRMM;LIRMM
Venue:
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 3 - Volume 04
Year:
2005

Abstract

When designing a SoC, meeting the required performance in terms of both processing power and power consumption is becoming increasingly challenging. Moreover, since the range of applications targeted by any single product is growing rapidly, employing reconfigurable accelerators makes more and more sense. Coarse-grain reconfigurable architectures offer an alternative with attractive performance/flexibility trade-offs compared with traditional approaches. This paper presents an original method for efficiently exploiting dynamic parallelism at both loop level and task level, which remains rarely exploited. The method, called DHM (Dynamic Hardware Multiplexing), relies on a hardwired controller dedicated to run-time task scheduling and automatic loop unrolling. The paper shows that significant performance improvements can be achieved by combining intra-task and inter-task parallelism. The principles and their validation are presented through a case study on a coarse-grain reconfigurable architecture.