Rapid application specific floating-point unit generation with bit-alignment

Authors:
Yee Jern Chong;Sri Parameswaran
Affiliations:
The University of New South Wales, Sydney, Australia;The University of New South Wales, Sydney, Australia
Venue:
Proceedings of the 45th annual Design Automation Conference
Year:
2008

Abstract

While ASIPs allow designers to create processors with custom instructions that target specific applications, floating-point units are still instantiated as fixed general-purpose units, which wastes area when they are not fully utilized. There is therefore a need for custom FPUs for embedded systems. Creating a custom FPU requires selecting a subset of the full floating-point instruction set and implementing this subset in hardware, such that the runtime of the application is minimized. To minimize area, it is desirable to merge the datapaths of the individual floating-point operations so that redundant hardware is minimized. Floating-point datapaths are complex and contain components with varying bit-widths, so components of different bit-widths must be shared. However, this introduces the problem of bit-alignment: determining how smaller resources should be aligned within larger resources when they are merged. This problem has been largely neglected in previous work. This paper presents a novel algorithm for solving the bit-alignment problem that integrates directly into the datapath merging process. Solving the bit-alignment problem makes automatic datapath merging available for FPU generation. To explore the trade-offs between area and performance, a rapid design space exploration was performed to determine which FP operations should be implemented in hardware rather than emulated. Our results show that more floating-point hardware does not necessarily mean lower runtime if the additional hardware increases delay. We found that datapath merging with bit-alignment reduced area by an average of 22.5% in our benchmarks, compared to an average of 14.1% for merging without bit-alignment.
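To make the bit-alignment question concrete, the following is a minimal illustrative sketch and not the algorithm presented in the paper. It assumes the simplest possible case: a narrow addition is mapped onto a wider shared adder, and the only choice is the bit offset at which the narrow operands sit inside the wide inputs. The function names `valid_offsets` and `shared_add` and the example widths are hypothetical, chosen only for illustration.

```python
def valid_offsets(wide_bits: int, narrow_bits: int) -> list[int]:
    """Enumerate every offset at which a narrow resource fits inside a wide one."""
    if narrow_bits > wide_bits:
        raise ValueError("narrow resource does not fit in the wide resource")
    return list(range(wide_bits - narrow_bits + 1))


def shared_add(a: int, b: int, narrow_bits: int, wide_bits: int, offset: int) -> int:
    """Perform a narrow_bits-wide addition on a wide_bits-wide adder.

    The operands are placed at bit position `offset` of the wide adder's
    inputs (lower bits are zero, so no carry enters from below), and the
    narrow result is read back from the same bit positions.
    """
    if offset not in valid_offsets(wide_bits, narrow_bits):
        raise ValueError("offset places the narrow operand outside the wide adder")
    narrow_mask = (1 << narrow_bits) - 1
    wide_mask = (1 << wide_bits) - 1
    # Align both operands at the chosen offset inside the wide inputs.
    wide_a = (a & narrow_mask) << offset
    wide_b = (b & narrow_mask) << offset
    # The wide adder itself: a plain addition truncated to wide_bits.
    wide_sum = (wide_a + wide_b) & wide_mask
    # Extract the narrow result from the aligned bit positions.
    return (wide_sum >> offset) & narrow_mask


if __name__ == "__main__":
    # Hypothetical widths: a 4-bit addition sharing an 8-bit adder.
    for off in valid_offsets(8, 4):
        print(off, shared_add(9, 5, narrow_bits=4, wide_bits=8, offset=off))
```

In this toy case every offset produces the same arithmetic result, so correctness alone does not fix the alignment. In a merged datapath the choice presumably matters because neighbouring shared components must agree on where the narrow signal sits, which is why the alignment has to be decided as part of the merging process rather than afterwards.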