Overcoming the challenges to feedback-directed optimization (Keynote Talk)
DYNAMO '00 Proceedings of the ACM SIGPLAN workshop on Dynamic and adaptive compilation and optimization
Floating Point to Fixed Point Conversion of C Code
CC '99 Proceedings of the 8th International Conference on Compiler Construction, Held as Part of the European Joint Conferences on the Theory and Practice of Software, ETAPS'99
A Library of Parameterized Floating-Point Modules and Their Use
FPL '02 Proceedings of the Reconfigurable Computing Is Going Mainstream, 12th International Conference on Field-Programmable Logic and Applications
Synchronous Interlocked Pipelines
ASYNC '02 Proceedings of the 8th International Symposium on Asynchronous Circuits and Systems
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the International Symposium on Code Generation and Optimization: Feedback-Directed and Runtime Optimization
Unifying Bit-Width Optimisation for Fixed-Point and Floating-Point Designs
FCCM '04 Proceedings of the 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
When FPGAs are better at floating-point than microprocessors
Proceedings of the 16th International ACM/SIGDA Symposium on Field Programmable Gate Arrays
3D finite difference computation on GPUs using CUDA
Proceedings of the 2nd Workshop on General Purpose Processing on Graphics Processing Units
ARITH '11 Proceedings of the 2011 IEEE 20th Symposium on Computer Arithmetic
A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications
Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays
The key to enabling widespread use of FPGAs for algorithm acceleration is allowing programmers to create efficient designs without the time-consuming hardware design process. Programmers are accustomed to developing scientific and mathematical algorithms in high-level languages (C/C++) using floating-point data types. Although easy to implement, the dynamic range provided by floating point is unnecessary in many applications; more efficient implementations can be realized with fixed-point arithmetic. While this topic has been studied previously [Han et al. 2006; Olson et al. 1999; Gaffar et al. 2004; Aamodt and Chow 1999], full automation has remained out of reach. We present a novel design flow for cases where FPGAs are used to offload computations from a microprocessor. Our LLVM-based algorithm inserts value-profiling code into an unmodified C/C++ application to guide its automatic conversion to fixed point. This allows fast and accurate design-space exploration on a host microprocessor before any accelerators are mapped to the FPGA. Experimental results demonstrate that fixed-point conversion can yield resource reductions of up to 2x--3x, minimizes embedded RAM usage, and achieves a 13%--22% higher Fmax than the original floating-point implementation. In a case study, we show that using our algorithm in conjunction with a High-Level Synthesis (HLS) tool yields a 17% reduction in logic and a 24% reduction in register usage.
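To make the floating-to-fixed-point idea concrete, the sketch below shows the kind of arithmetic such a conversion emits. The Q16.16 format, the `fix16` type name, and the helper functions are illustrative assumptions, not the paper's actual output; the described flow would instead pick integer and fraction bit widths per variable from profiled value ranges.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative Q16.16 fixed-point type: 16 integer bits, 16 fraction bits.
 * A profile-guided flow would size these fields per variable. */
typedef int32_t fix16;
#define FIX16_FRAC_BITS 16

/* Convert a double to fixed point by scaling and rounding to nearest. */
static inline fix16 fix16_from_double(double x) {
    return (fix16)(x * (1 << FIX16_FRAC_BITS) + (x >= 0 ? 0.5 : -0.5));
}

/* Convert back to double by dividing out the scale factor. */
static inline double fix16_to_double(fix16 x) {
    return (double)x / (1 << FIX16_FRAC_BITS);
}

/* Fixed-point multiply: widen to 64 bits so the product of two scaled
 * values does not overflow, then shift the extra scale factor back out. */
static inline fix16 fix16_mul(fix16 a, fix16 b) {
    return (fix16)(((int64_t)a * b) >> FIX16_FRAC_BITS);
}
```

Because all operations reduce to integer adds, multiplies, and shifts, a design in this form maps to far fewer FPGA resources than an IEEE floating-point datapath, at the cost of a bounded range and precision that profiling must validate.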