Floating-point to fixed-point conversion

  • Authors: Changchun Shi; Robert W. Brodersen
  • Affiliations: University of California, Berkeley; University of California, Berkeley
  • Venue: Floating-point to fixed-point conversion
  • Year: 2004

Abstract

The digital signal processing (DSP) algorithms used in communication systems are typically specified with floating-point or, ideally, infinite-precision operations. Digital VLSI implementations of these algorithms, on the other hand, rely on fixed-point approximations to reduce hardware cost while increasing throughput. One essential step of a top-down design flow is to determine the fixed-point data type of each signal node, namely its word length, truncation mode, and overflow mode. This is commonly referred to as the floating-point to fixed-point conversion (FFC) problem. Conventional approaches are typically both time-consuming and error-prone, since ad hoc assignments of fixed-point data types are performed manually and iteratively. We first formulate the FFC problem as an optimization problem. The optimization variables are the fixed-point data types to be determined, the objective function is hardware cost, and the constraint functions are the system specifications. Past techniques are compared from this unified point of view. A primary goal is to make the optimization automatic and fast, which requires an understanding of how these functions depend on the variables. One critical step is identifying a metric that judges the quality of an FFC and is sufficiently general. This metric is directly related to quantization effects and serves as the constraint function. We first categorize the functional blocks in a system according to their quantization behavior; a novel statistical perturbation theory then provides guidelines for using simulations to obtain the constraint functions in semi-analytical form. This theoretical work reduces the otherwise exponential complexity of characterizing quantization effects to polynomial complexity. The other critical step toward automated FFC is the automatic acquisition of the hardware-cost function.
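To make the three parts of a fixed-point data type concrete (word length, truncation mode, overflow mode), the following is a minimal Python sketch of a signed two's-complement quantizer. It is an illustration only, not the authors' tool: the function name `quantize`, its parameters, and the mode names `"truncate"`, `"round"`, `"saturate"`, and `"wrap"` are assumptions introduced here.

```python
import math


def quantize(x, word_length, frac_length, rounding="truncate", overflow="saturate"):
    """Quantize x to a signed fixed-point value (hypothetical helper).

    word_length: total number of bits, including the sign bit.
    frac_length: number of fractional bits (assumed non-negative).
    rounding:    "truncate" drops extra bits (rounds toward -infinity),
                 "round" rounds to the nearest step (ties toward +infinity).
    overflow:    "saturate" clips to the representable range,
                 "wrap" wraps around as two's-complement hardware would.
    """
    scale = 1 << frac_length
    scaled = x * scale

    if rounding == "truncate":
        q = math.floor(scaled)
    elif rounding == "round":
        q = math.floor(scaled + 0.5)
    else:
        raise ValueError("unknown rounding mode: " + rounding)

    lo = -(1 << (word_length - 1))      # most negative integer code
    hi = (1 << (word_length - 1)) - 1   # most positive integer code

    if overflow == "saturate":
        q = max(lo, min(hi, q))
    elif overflow == "wrap":
        q = (q - lo) % (1 << word_length) + lo
    else:
        raise ValueError("unknown overflow mode: " + overflow)

    return q / scale


if __name__ == "__main__":
    # 8-bit word with 4 fractional bits: step 1/16, range [-8, 7.9375].
    print(quantize(0.3, 8, 4, rounding="truncate"))   # 0.25
    print(quantize(0.3, 8, 4, rounding="round"))      # 0.3125
    print(quantize(10.0, 8, 4, overflow="saturate"))  # 7.9375
    print(quantize(10.0, 8, 4, overflow="wrap"))      # -6.0
```

The difference between the input and the returned value is the quantization error whose system-level effect the constraint functions are meant to capture. Shorter word lengths lower hardware cost but increase that error.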
The hardware-cost function is obtained with a high-level resource estimation tool and function fitting. Based on this methodology, an FFC tool has been built in the Matlab and Simulink environment for Xilinx FPGA designs as a demonstration. The FFC tool has been successfully tested on several complicated digital designs, namely a binary phase shift keying (BPSK) transceiver, a U-Sigma block of a singular value decomposition (SVD) system, and an ultra-wideband (UWB) system. The conversions normally take from minutes to hours, depending on system complexity. This is orders of magnitude faster than existing tools, which are projected to take weeks to perform the same conversions. Without reducing system performance, the FFC reduces hardware cost by a factor of 1.5 to 50. The hardware resource estimation part of our FFC utility is based on my summer internship project at Xilinx, Inc. Unlike existing resource estimators that rely on post-netlisting information or the post-placement-and-routing map report, this pre-netlisting estimator (now part of System Generator 3.1) in the Matlab environment speeds up estimation by 2–3 orders of magnitude. The proposed FFC methodology can also be applied to ASIC design, where hardware cost is measured as chip area, power consumption, and so on. One prerequisite is a similar hardware estimation tool and hardware-cost function model.
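The idea of obtaining a hardware-cost function by function fitting can be sketched as follows. The abstract does not state which model form the authors fit, so the linear model `cost = a + b * word_length` is an assumption, the helper `fit_linear_cost` is hypothetical, and the sample points are synthetic values generated from a known line rather than output of any resource estimator.

```python
def fit_linear_cost(word_lengths, costs):
    """Least-squares fit of cost = a + b * word_length (assumed model form).

    Returns (a, b). Uses the closed-form simple linear regression solution.
    """
    n = len(word_lengths)
    if n < 2 or n != len(costs):
        raise ValueError("need at least two matching samples")

    mean_w = sum(word_lengths) / n
    mean_c = sum(costs) / n
    sxx = sum((w - mean_w) ** 2 for w in word_lengths)
    sxy = sum((w - mean_w) * (c - mean_c) for w, c in zip(word_lengths, costs))
    if sxx == 0:
        raise ValueError("word lengths must not all be equal")

    b = sxy / sxx
    a = mean_c - b * mean_w
    return a, b


if __name__ == "__main__":
    # Synthetic samples generated from cost = 2 + 3 * w, for illustration only.
    word_lengths = [4, 8, 12, 16]
    costs = [2 + 3 * w for w in word_lengths]

    a, b = fit_linear_cost(word_lengths, costs)
    print(a, b)  # recovers the generating coefficients, 2.0 and 3.0

    # The fitted model can then predict cost for a word length that was not sampled.
    print(a + b * 10)
```

In a real flow, the sample costs would come from running the resource estimator at a few word lengths per block. The fitted model would then stand in for the estimator inside the optimization loop, so the optimizer does not have to invoke the estimator at every candidate word length.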