Architecture and synthesis for multi-cycle communication
Proceedings of the 2003 international symposium on Physical design
Low-power high-level synthesis for FPGA architectures
Proceedings of the 2003 international symposium on Low power electronics and design
Application-specific instruction generation for configurable processor architectures
FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
Architecture-level synthesis for automatic interconnect pipelining
Proceedings of the 41st annual Design Automation Conference
Gradual Relaxation Techniques with Applications to Behavioral Synthesis
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Architectural Synthesis Integrated with Global Placement for Multi-Cycle Communication
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Register binding and port assignment for multiplexer optimization
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Optimal module and voltage assignment for low-power
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Bitwidth-aware scheduling and binding in high-level synthesis
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Optimal simultaneous module and multivoltage assignment for low power
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Optimality study of resource binding with multi-Vdds
Proceedings of the 43rd annual Design Automation Conference
Data-flow transformations using Taylor expansion diagrams
Proceedings of the conference on Design, automation and test in Europe
Compatibility path based binding algorithm for interconnect reduction in high level synthesis
Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design
Simultaneous FU and register binding based on network flow method
Proceedings of the conference on Design, automation and test in Europe
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Better than optimum?: register reduction using idle pipelined functional units
Proceedings of the 19th ACM Great Lakes symposium on VLSI
Optimization of data-flow computations using canonical TED representation
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
A functional unit and register binding algorithm for interconnect reduction
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A global interconnect reduction technique during high level synthesis
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
A CAD framework for Malibu: an FPGA with time-multiplexed coarse-grained elements
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
Word-Length Aware DSP Hardware Design Flow Based on High-Level Synthesis
Journal of Signal Processing Systems
Variation-aware layout-driven scheduling for performance yield optimization
Proceedings of the International Conference on Computer-Aided Design
Rapid Synthesis and Simulation of Computational Circuits in an MPPA
Journal of Signal Processing Systems
Fast and effective placement and routing directed high-level synthesis for FPGAs
Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays
Hi-index | 0.00 |
Although throughput alone can be arbitrarily improved for several classes of systems using previously published techniques, none of those approaches are effective when latency constraints, which are increasingly important in embedded DSP systems, are considered. After formally establishing the relationship between latency and throughput in general computation, we explore the effect of pipelining on latency, and establish necessary and sufficient conditions under which pipelining does not alter latency. Many systems are either linear, or have subsystems that are linear. For such cases we have used a state-space based approach that treats various transformations in an integrated fashion, and answers analytically whether it is possible to simultaneously meet any given combination of constraints on latency and throughput, The analytic approach is constructive in nature, and produces a complete implementation when feasibility conditions are fulfilled. We also present a suboptimal but hardware efficient heuristic approach for the special case of initially-relaxed single-input single-output linear time-invariant computations. A novel software platform consisting of a high-level synthesis system coupled to a symbolic algebra system was used to implement the proposed algorithm transformations. Instead of optimizing to improve throughput and latency, our transformations can also be used to increase the implementation efficiency while achieving the same latency and throughput as the original design.