Optimum and heuristic transformation techniques for simultaneous optimization of latency and throughput

Authors:
Mani B. Srivastava;Miodrag Potkonjak
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
1995

Citing 0
Cited 26

Architecture and synthesis for multi-cycle communication

Proceedings of the 2003 international symposium on Physical design
Low-power high-level synthesis for FPGA architectures

Proceedings of the 2003 international symposium on Low power electronics and design
Application-specific instruction generation for configurable processor architectures

FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
Architecture-level synthesis for automatic interconnect pipelining

Proceedings of the 41st annual Design Automation Conference
Gradual Relaxation Techniques with Applications to Behavioral Synthesis

Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Architectural Synthesis Integrated with Global Placement for Multi-Cycle Communication

Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Register binding and port assignment for multiplexer optimization

Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Optimal module and voltage assignment for low-power

Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Bitwidth-aware scheduling and binding in high-level synthesis

Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Optimal simultaneous module and multivoltage assignment for low power

ACM Transactions on Design Automation of Electronic Systems (TODAES)
Optimality study of resource binding with multi-Vdds

Proceedings of the 43rd annual Design Automation Conference
Data-flow transformations using Taylor expansion diagrams

Proceedings of the conference on Design, automation and test in Europe
Compatibility path based binding algorithm for interconnect reduction in high level synthesis

Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design
Simultaneous FU and register binding based on network flow method

Proceedings of the conference on Design, automation and test in Europe
FastYield: variation-aware, layout-driven simultaneous binding and module selection for performance yield optimization

Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Simultaneous resource binding and interconnection optimization based on a distributed register-file microarchitecture

ACM Transactions on Design Automation of Electronic Systems (TODAES)
Better than optimum?: register reduction using idle pipelined functional units

Proceedings of the 19th ACM Great Lakes symposium on VLSI
Optimization of data-flow computations using canonical TED representation

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
A functional unit and register binding algorithm for interconnect reduction

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
LOPASS: a low-power architectural synthesis system for FPGAs with interconnect estimation and optimization

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A global interconnect reduction technique during high level synthesis

Proceedings of the 2010 Asia and South Pacific Design Automation Conference
A CAD framework for Malibu: an FPGA with time-multiplexed coarse-grained elements

Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
Word-Length Aware DSP Hardware Design Flow Based on High-Level Synthesis

Journal of Signal Processing Systems
Variation-aware layout-driven scheduling for performance yield optimization

Proceedings of the International Conference on Computer-Aided Design
Rapid Synthesis and Simulation of Computational Circuits in an MPPA

Journal of Signal Processing Systems
Fast and effective placement and routing directed high-level synthesis for FPGAs

Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays

Abstract

Although throughput alone can be arbitrarily improved for several classes of systems using previously published techniques, none of those approaches is effective when latency constraints, which are increasingly important in embedded DSP systems, are considered. After formally establishing the relationship between latency and throughput in general computation, we explore the effect of pipelining on latency, and establish necessary and sufficient conditions under which pipelining does not alter latency. Many systems are either linear or have subsystems that are linear. For such cases we have used a state-space-based approach that treats various transformations in an integrated fashion, and answers analytically whether it is possible to simultaneously meet any given combination of constraints on latency and throughput. The analytic approach is constructive in nature, and produces a complete implementation when feasibility conditions are fulfilled. We also present a suboptimal but hardware-efficient heuristic approach for the special case of initially-relaxed single-input single-output linear time-invariant computations. A novel software platform consisting of a high-level synthesis system coupled to a symbolic algebra system was used to implement the proposed algorithm transformations. Rather than improving throughput and latency, our transformations can alternatively be used to increase implementation efficiency while achieving the same latency and throughput as the original design.
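
To make the state-space view of a linear computation concrete, the sketch below applies a two-step look-ahead (unfolding) to a single-input single-output system x[n+1] = A x[n] + B u[n], y[n] = C x[n] + D u[n]. It is an illustration only, not the algorithm from the paper: look-ahead is a standard throughput-oriented transformation, and the function names, the example matrices, and the choice of Python are all assumptions made here. The point it shows is that the transformed recursion advances the state by two samples per iteration (using A², AB and B) while producing exactly the same outputs.

```python
# Illustrative sketch only: names and example values are assumptions,
# not taken from the paper.

def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_vec(a, v):
    """Product of a matrix (list of rows) and a column vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]

def dot(u, v):
    """Inner product of two vectors."""
    return sum(p * q for p, q in zip(u, v))

def simulate(A, B, C, D, x0, inputs):
    """Direct form: one sample per iteration of the state recursion."""
    x, ys = list(x0), []
    for u in inputs:
        ys.append(dot(C, x) + D * u)
        x = [ax + b * u for ax, b in zip(mat_vec(A, x), B)]
    return ys

def simulate_lookahead2(A, B, C, D, x0, inputs):
    """Two-step look-ahead: x[n+2] = A^2 x[n] + A B u[n] + B u[n+1].

    Assumes an even number of input samples.
    """
    A2 = mat_mul(A, A)                      # A squared
    AB = mat_vec(A, B)                      # column vector A*B
    CA = [dot(C, [row[j] for row in A])     # row vector C*A
          for j in range(len(A[0]))]
    CB = dot(C, B)                          # scalar C*B
    x, ys = list(x0), []
    for i in range(0, len(inputs) - 1, 2):
        u0, u1 = inputs[i], inputs[i + 1]
        ys.append(dot(C, x) + D * u0)                  # y[n]
        ys.append(dot(CA, x) + CB * u0 + D * u1)       # y[n+1]
        x = [a2x + ab * u0 + b * u1
             for a2x, ab, b in zip(mat_vec(A2, x), AB, B)]
    return ys

# Hypothetical second-order example, initially relaxed (zero state).
A = [[0.5, 0.25], [-0.25, 0.5]]
B = [1.0, 0.5]
C = [1.0, -1.0]
D = 0.25
x0 = [0.0, 0.0]
u = [1.0, 0.0, -2.0, 3.0, 0.5, 0.0, 1.0, -1.0]

direct = simulate(A, B, C, D, x0, u)
unfolded = simulate_lookahead2(A, B, C, D, x0, u)
```

In this sketch both outputs of a block depend on the first input of that block, so the recursive loop now spans two samples; whether that helps or hurts latency in a given implementation is exactly the kind of question the abstract says the paper treats analytically.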