An algorithm for routing with capacitance/distance constraints for clock distribution in microprocessors

Authors:
Rupesh S. Shelar
Affiliations:
Intel Corporation, Hillsboro, OR, USA
Venue:
Proceedings of the 2009 international symposium on Physical design
Year:
2009

Abstract

In modern microprocessors, clocks are usually distributed through a hybrid network, a grid followed by buffered trees, to restrict the skew. Typically, (gated) buffered trees are used inside the blocks, while the global grid overlays the entire die area. The block-level buffered trees are connected to the grid at specific locations by routing wires along predetermined tracks. The routing of these clock wires, which consume noticeable power, has distance and capacitance constraints to avoid poor slopes at the inputs of the block-level buffers. Moreover, these wires also contribute significant load to the clock grid. This leads to a problem of capacitance or wirelength minimization during multi-terminal routing such that wires use pre-specified tracks and routes obey distance and capacitance constraints, i.e., the length of the route from any receiver to a connection on the grid wire is less than the specified distance and the overall capacitance due to all receivers on the route is less than the given limit. Since the problem is intractable, we present an efficient algorithm that completes the routing, connecting thousands of terminals over an area of a few $\mathrm{mm}^2$ in seconds and improving the wirelength by 17% over the commonly used nearest-source heuristic. The algorithm is employed to perform post-grid clock distribution in a 45 nm technology microprocessor.
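To make the two constraints and the baseline concrete, the following is a minimal sketch of a nearest-source assignment together with a constraint check. It is not the algorithm presented in the paper. It assumes a strongly simplified model that the abstract does not specify: each receiver is wired directly to its closest grid connection, route length is the Manhattan distance (pre-specified tracks are ignored), wire capacitance is linear in length, and "route" is read as all receivers sharing one grid connection. The names `Point`, `nearest_source`, `total_wirelength`, `violations`, and `cap_per_unit` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


def manhattan(a: Point, b: Point) -> float:
    """Rectilinear distance, used here as a stand-in for route length."""
    return abs(a.x - b.x) + abs(a.y - b.y)


def nearest_source(receivers, sources):
    """Map each receiver index to the index of its closest grid connection."""
    return {
        i: min(range(len(sources)), key=lambda j: manhattan(r, sources[j]))
        for i, r in enumerate(receivers)
    }


def total_wirelength(receivers, sources, assignment):
    """Sum of direct receiver-to-source wire lengths (no wire sharing assumed)."""
    return sum(manhattan(receivers[i], sources[j]) for i, j in assignment.items())


def violations(receivers, pin_caps, sources, assignment,
               max_dist, max_cap, cap_per_unit):
    """Return ("distance", receiver) and ("capacitance", source) violations.

    Both limits are strict: a value equal to the limit counts as a violation,
    following the "less than" wording of the abstract.
    """
    found = []
    load = defaultdict(float)  # assumed model: pin cap plus linear wire cap
    for i, j in assignment.items():
        d = manhattan(receivers[i], sources[j])
        if d >= max_dist:
            found.append(("distance", i))
        load[j] += pin_caps[i] + cap_per_unit * d
    for j, c in sorted(load.items()):
        if c >= max_cap:
            found.append(("capacitance", j))
    return found
```

In this simplified model, a nearest-source assignment can satisfy the distance limit and still overload a grid connection that many receivers happen to be close to, which is one reason both constraints have to be considered together during routing.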