Exploiting local logic structures to optimize multi-core SoC floorplanning

Authors:
Cheng-Hong Li;Sampada Sonalkar;Luca P. Carloni
Affiliations:
Columbia University in the City of New York;Columbia University in the City of New York;Columbia University in the City of New York
Venue:
Proceedings of the Conference on Design, Automation and Test in Europe
Year:
2010

Citing 24
Cited 0

Acyclic multi-way partitioning of Boolean networks

DAC '94 Proceedings of the 31st annual Design Automation Conference
Network flow based circuit partitioning for time-multiplexed FPGAs

Proceedings of the 1998 IEEE/ACM international conference on Computer-aided design
Performance analysis and optimization of latency insensitive systems

Proceedings of the 37th Annual Design Automation Conference
Coping with Latency in SOC Design

IEEE Micro
Local unidirectional bias for smooth cutsize-delay tradeoff in performance-driven bipartitioning

Proceedings of the 2003 international symposium on Physical design
Profile-guided microarchitectural floorplanning for deep submicron processor design

Proceedings of the 41st annual Design Automation Conference
Microarchitecture-aware floorplanning using a statistical design of experiments approach

Proceedings of the 42nd annual Design Automation Conference
Processing Rate Optimization by Sequential System Floorplanning

ISQED '06 Proceedings of the 7th International Symposium on Quality Electronic Design
Synthesis of synchronous elastic architectures

Proceedings of the 43rd annual Design Automation Conference
Microarchitecture configurations and floorplanning co-optimization

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Design, Implementation, and Validation of a New Class of Interface Circuits for Latency-Insensitive Design

MEMOCODE '07 Proceedings of the 5th IEEE/ACM International Conference on Formal Methods and Models for Codesign
Correct-by-construction microarchitectural pipelining

Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
Clustering for processing rate optimization

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Bounded dataflow networks and latency-insensitive circuits

MEMOCODE'09 Proceedings of the 7th IEEE/ACM international conference on Formal Methods and Models for Codesign
Video-rate stereo depth measurement on programmable hardware

CVPR'03 Proceedings of the 2003 IEEE computer society conference on Computer vision and pattern recognition
Fixed-outline floorplanning: enabling hierarchical design

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Faster maximum and minimum mean cycle algorithms for system-performance analysis

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
System-level design: orthogonalization of concerns and platform-based design

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Theory of latency-insensitive design

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Generic ILP-based approaches for time-multiplexed FPGA partitioning

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Throughput-driven floorplanning with wire pipelining

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Performance analysis of latency-insensitive systems

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Min-cut floorplacement

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
A New Multilevel Framework for Large-Scale Interconnect-Driven Floorplanning

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Abstract

We present a throughput-driven partitioning algorithm and a throughput-preserving merging algorithm for the high-level physical synthesis of latency-insensitive (LI) systems. These two algorithms are integrated along with a published floorplanner [5] in a new iterative physical synthesis flow to optimize system throughput and reduce area occupation. The partitioning algorithm performs bottom-up clustering of the internal logic of a given IP core to divide it into smaller ones, each of which has no combinational path from input to output and thus is legal for LI-interface encapsulation. Applying this algorithm to cores on critical feedback loops optimizes their length and in turn enables throughput optimization via the subsequent floorplanning. The merging algorithm reduces the number of cores on non-critical loops, lowering the overall area taken by LI interfaces without hurting the system throughput. Experimental results on a large system-on-chip design show a 16.7% speedup in system throughput and a 2.1% reduction in area occupation.
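
The legality condition mentioned in the abstract (a core may be wrapped in an LI interface only if it has no combinational path from input to output) can be illustrated with a small reachability check. This is a minimal sketch, not the authors' partitioning algorithm: the netlist representation (a `fanout` adjacency dictionary and a set of `registers`), the function name `has_combinational_io_path`, and the example node names are all assumptions made for illustration.

```python
from collections import deque


def has_combinational_io_path(fanout, registers, inputs, outputs):
    """Return True if some input reaches some output without crossing a register.

    fanout    -- dict mapping a node to the nodes it drives (hypothetical format)
    registers -- set of nodes that are sequential elements
    inputs    -- iterable of input port nodes of the candidate core
    outputs   -- iterable of output port nodes of the candidate core
    """
    outputs = set(outputs)
    seen = set(inputs)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        if node in outputs:
            return True
        for succ in fanout.get(node, ()):
            # A register ends the combinational path, so the search stops there.
            if succ in registers or succ in seen:
                continue
            seen.add(succ)
            queue.append(succ)
    return False


# Hypothetical core: in0 -> g1 -> r1 (register) -> g2 -> out0
#                    in1 -> g3 -> out1 (purely combinational)
fanout = {
    "in0": ["g1"], "g1": ["r1"], "r1": ["g2"], "g2": ["out0"],
    "in1": ["g3"], "g3": ["out1"],
}
registers = {"r1"}

registered_ok = not has_combinational_io_path(fanout, registers, ["in0"], ["out0"])
feedthrough_bad = has_combinational_io_path(fanout, registers, ["in1"], ["out1"])
```

Under these assumptions, a cluster containing only the `in0` to `out0` logic would pass the check, while one that includes the `in1` to `out1` feedthrough would not and would need to be split or grouped differently.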