Acyclic multi-way partitioning of Boolean networks
DAC '94 Proceedings of the 31st annual Design Automation Conference
Network flow based circuit partitioning for time-multiplexed FPGAs
Proceedings of the 1998 IEEE/ACM international conference on Computer-aided design
Performance analysis and optimization of latency insensitive systems
Proceedings of the 37th Annual Design Automation Conference
Coping with Latency in SOC Design
IEEE Micro
Local unidirectional bias for smooth cutsize-delay tradeoff in performance-driven bipartitioning
Proceedings of the 2003 international symposium on Physical design
Profile-guided microarchitectural floorplanning for deep submicron processor design
Proceedings of the 41st annual Design Automation Conference
Microarchitecture-aware floorplanning using a statistical design of experiments approach
Proceedings of the 42nd annual Design Automation Conference
Processing Rate Optimization by Sequential System Floorplanning
ISQED '06 Proceedings of the 7th International Symposium on Quality Electronic Design
Synthesis of synchronous elastic architectures
Proceedings of the 43rd annual Design Automation Conference
Microarchitecture configurations and floorplanning co-optimization
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
MEMOCODE '07 Proceedings of the 5th IEEE/ACM International Conference on Formal Methods and Models for Codesign
Correct-by-construction microarchitectural pipelining
Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
Clustering for processing rate optimization
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Bounded dataflow networks and latency-insensitive circuits
MEMOCODE'09 Proceedings of the 7th IEEE/ACM international conference on Formal Methods and Models for Codesign
Video-rate stereo depth measurement on programmable hardware
CVPR'03 Proceedings of the 2003 IEEE computer society conference on Computer vision and pattern recognition
Fixed-outline floorplanning: enabling hierarchical design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Faster maximum and minimum mean cycle algorithms for system-performance analysis
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
System-level design: orthogonalization of concerns and platform-based design
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Theory of latency-insensitive design
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Generic ILP-based approaches for time-multiplexed FPGA partitioning
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Throughput-driven floorplanning with wire pipelining
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Performance analysis of latency-insensitive systems
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
A New Multilevel Framework for Large-Scale Interconnect-Driven Floorplanning
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Hi-index | 0.00 |
We present a throughput-driven partitioning algorithm and a throughput-preserving merging algorithm for the high-level physical synthesis of latency-insensitive (LI) systems. These two algorithms are integrated along with a published floorplanner [5] in a new iterative physical synthesis flow to optimize system throughput and reduce area occupation. The partitioning algorithm performs bottom-up clustering of the internal logic of a given IP core to divide it into smaller ones, each of which has no combinational path from input to output and thus is legal for LI-interface encapsulation. Applying this algorithm to cores on critical feedback loops optimizes their length and in turn enables throughput optimization via the subsequent floorplanning. The merging algorithm reduces the number of cores on non-critical loops, lowering the overall area taken by LI interfaces without hurting the system throughput. Experimental results on a large system-on-chip design show a 16.7% speedup in system throughput and a 2.1% reduction in area occupation.