Outer loop pipelining for application specific datapaths in FPGAs

Authors:
Kieron Turkington; George A. Constantinides; Konstantinos Masselos; Peter Y. K. Cheung
Affiliations:
Department of Electrical and Electronic Engineering, Imperial College London, London, UK; Department of Electrical and Electronic Engineering, Imperial College London, London, UK; Department of Computer Science and Technology, University of Peloponnese, Tripolis, Greece; Department of Electrical and Electronic Engineering, Imperial College London, London, UK
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2008

Abstract

Most hardware compilers apply loop pipelining to increase the parallelism achieved, but pipelining is restricted to only the innermost level of a nested loop. In this work we extend and adapt an existing outer loop pipelining approach, known as single-dimension software pipelining, to generate schedules for field-programmable gate array (FPGA) hardware coprocessors. Each loop level in nine test loops is pipelined, and the resulting schedules are implemented in VHDL and targeted to an Altera Stratix II FPGA. The results show that the fastest solution for all but one of the loops occurs when pipelining is applied one to three levels above the innermost loop. Across the nine test loops we achieve an acceleration over the innermost loop solution of up to seven times, with a mean speedup of 3.2 times. The results suggest that including outer loop pipelining in future hardware compilers may be worthwhile, as it can significantly improve results at the cost of a small increase in compile time.