Optimal partitioners and end-case placers for standard-cell layout

  • Authors:
  • A. E. Caldwell;A. B. Kahng;I. L. Markov

  • Affiliations:
  • Simplex Solutions Inc., Sunnyvale, CA;-;-

  • Venue:
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • Year:
  • 2006

Quantified Score

Hi-index 0.03

Visualization

Abstract

We study alternatives to classic Fiduccia-Mattheyses (FM)-based partitioning algorithms in the context of end-case processing for top-down standard-cell placement. While the divide step in the top-down divide and conquer is usually performed heuristically, we observe that optimal solutions can be found for many sufficiently small partitioning instances. Our main motivation is that small partitioning instances frequently contain multiple cells that are larger than the prescribed partitioning tolerance, and that cannot be moved iteratively while preserving the legality of a solution. To sample the suboptimality of FM-based partitioning algorithms, we focus on optimal partitioning and placement algorithms based on either enumeration or branch-and-bound that are invoked for instances below prescribed size thresholds, e.g., <10 cells for placement and <30 cells for partitioning. Such partitioners transparently handle tight balance constraints and uneven cell sizes while typically achieving 40% smaller cuts than best of several FM starts for instances between ten and 50 movable nodes and average degree 2-3. Our branch-and-bound codes incorporate various efficiency improvements, using results for hypergraphs (1993) and a graph-specific algorithm (1996). We achieve considerable speed-ups over single FM starts on such instances on average. Enumeration-based partitioners relying on Gray codes, while easier to implement and taking less time for elementary operations, can only compete with branch-and-bound on very small instances, where optimal placers achieve reasonable performance as well. In the context of a top-down global placer, the right combination of optimal partitioners and placers can achieve up to an average of 10% wirelength reduction and 50% CPU time savings for a set of industry testcases. Our results show that run-time versus quality tradeoffs may be different for small problem instances than for common large benchmarks, resulting in different comparisons of optimization algorithms. We therefore suggest that alternative algorithms be considered and, as an example, present detailed comparisons with the flow-based balanced partitioner heuristic