Architecture-level synthesis for automatic interconnect pipelining
Proceedings of the 41st annual Design Automation Conference
Platform-based resource binding using a distributed register-file microarchitecture
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design
FPGA Design Automation: A Survey
Foundations and Trends in Electronic Design Automation
Compatibility path based binding algorithm for interconnect reduction in high level synthesis
Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design
A multicycle communication architecture and synthesis flow for global interconnect resource sharing
Proceedings of the 2008 Asia and South Pacific Design Automation Conference
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Power optimization with power islands synthesis
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Coordinated resource optimization in behavioral synthesis
Proceedings of the Conference on Design, Automation and Test in Europe
Word-Length Aware DSP Hardware Design Flow Based on High-Level Synthesis
Journal of Signal Processing Systems
Towards layout-friendly high-level synthesis
Proceedings of the 2012 ACM international symposium on International Symposium on Physical Design
A metric for layout-friendly microarchitecture optimization in high-level synthesis
Proceedings of the 49th Annual Design Automation Conference
Fast and effective placement and routing directed high-level synthesis for FPGAs
Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays
Hi-index | 0.03 |
For multigigahertz designs in nanometer technologies, data transfers on global interconnects take multiple clock cycles. In this paper, we propose a regular distributed register (RDR) microarchitecture, which offers high regularity and direct support of multicycle on-chip communication. The RDR microarchitecture divides the entire chip into an array of islands so that all local computation and communication within an island can be performed in a single clock cycle. Each island contains a cluster of computational elements, local registers, and a local controller. On top of the RDR microarchitecture, novel layout-driven architectural synthesis algorithms have been developed for multicycle communication, including scheduling-driven placement, placement-driven simultaneous scheduling with rebinding, and distributed control generation, etc. The experimentation on a number of real-life examples demonstrates promising results. For data flow intensive examples, we obtain a 44% improvement on average in terms of the clock period and a 37% improvement on average in terms of the final latency, over the traditional flow. For designs with control flow, our approach achieves a 28% clock-period reduction and a 23% latency reduction on average.