This paper presents an efficient modeling scheme and a partitioning heuristic for parallelizing VLSI post-placement timing optimization. Our novel modeling scheme encodes the paths with timing violations into a task graph, which efficiently represents the timing and spatial relations among timing optimization tasks. Our new partitioning algorithm then divides the task graph into multiple sessions of parallel processes so that interprocessor communication is completely eliminated during each session. This partitioning scheme is especially useful for parallelizing processes whose tasks are heavily connected and therefore have high communication requirements. For circuits with 20,000 to 130,000 cells, the partitioning heuristic dynamically uses 1 to 8 processors and achieves speedups in excess of 5× without degrading solution quality.
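The idea of communication-free sessions can be illustrated with a small sketch. This is not the paper's heuristic, which the abstract does not spell out; it is a simple greedy stand-in written under stated assumptions. The function name `partition_into_sessions`, the edge-list input, and the per-processor load cap are all hypothetical. An edge is assumed to mean that two timing optimization tasks interact (in timing or in layout space) and so must not run on different processors in the same session.

```python
from math import ceil


def partition_into_sessions(tasks, edges, num_procs):
    """Greedy sketch (hypothetical, not the paper's algorithm).

    Returns a list of sessions. Each session is a list of num_procs task
    groups, and no edge connects tasks in two different groups of the same
    session, so the groups can run without communicating.
    """
    # Build an adjacency map from the assumed undirected interaction edges.
    neighbors = {t: set() for t in tasks}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    remaining = list(tasks)
    sessions = []
    while remaining:
        # Assumed load cap to keep the groups of a session balanced.
        cap = ceil(len(remaining) / num_procs)
        groups = [[] for _ in range(num_procs)]
        owner = {}      # task -> processor index within this session
        deferred = []   # tasks pushed to a later session
        for t in remaining:
            procs = {owner[n] for n in neighbors[t] if n in owner}
            if len(procs) > 1:
                # Interacts with tasks on several processors: defer it.
                deferred.append(t)
                continue
            if procs:
                p = next(iter(procs))  # must join its neighbors
            else:
                # No placed neighbors: pick the least-loaded processor.
                p = min(range(num_procs), key=lambda i: len(groups[i]))
            if len(groups[p]) >= cap:
                deferred.append(t)     # group is full: defer it
                continue
            groups[p].append(t)
            owner[t] = p
        sessions.append(groups)
        remaining = deferred
    return sessions


# Four tasks chained a-b-c-d, scheduled on two processors.
example = partition_into_sessions(
    ["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")], 2
)
print(example)
```

In the chain example, task `c` interacts with tasks placed on both processors, so it is deferred to a second session. Sessions run one after another, which is why edges that cross sessions need no communication while a session is running.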