Cluster-aware iterative improvement techniques for partitioning large VLSI circuits

Authors:
Shantanu Dutt;Wenyong Deng
Affiliations:
University of Illinois-Chicago, Chicago, IL;Cadence Design Systems, San Jose, CA
Venue:
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Year:
2002

Citing 15
Cited 5

Cost minimization of partitions into multiple devices

DAC '93 Proceedings of the 30th international Design Automation Conference
A general framework for vertex orderings, with applications to netlist clustering

ICCAD '94 Proceedings of the 1994 IEEE/ACM international conference on Computer-aided design
Partitioning very large circuits using analytical placement techniques

DAC '94 Proceedings of the 31st annual Design Automation Conference
Spectral partitioning: the more eigenvectors, the better

DAC '95 Proceedings of the 32nd annual ACM/IEEE Design Automation Conference
Linear decomposition algorithm for VLSI design applications

ICCAD '95 Proceedings of the 1995 IEEE/ACM international conference on Computer-aided design
A probability-based approach to VLSI circuit partitioning

DAC '96 Proceedings of the 33rd annual Design Automation Conference
VLSI circuit partitioning by cluster-removal using iterative improvement techniques

Proceedings of the 1996 IEEE/ACM international conference on Computer-aided design
Multilevel hypergraph partitioning: application in VLSI domain

DAC '97 Proceedings of the 34th annual Design Automation Conference
Multilevel circuit partitioning

DAC '97 Proceedings of the 34th annual Design Automation Conference
Partitioning around roadblocks: tackling constraints with intermediate relaxations

ICCAD '97 Proceedings of the 1997 IEEE/ACM international conference on Computer-aided design
Large scale circuit partitioning with loose/stable net removal and signal flow based clustering

ICCAD '97 Proceedings of the 1997 IEEE/ACM international conference on Computer-aided design
A Fast and Robust Network Bisection Algorithm

IEEE Transactions on Computers
A proper model for the partitioning of electrical circuits

DAC '72 Proceedings of the 9th Design Automation Workshop
A linear-time heuristic for improving network partitions

DAC '82 Proceedings of the 19th Design Automation Conference
An evaluation of bipartitioning techniques

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

DVS: An Object-Oriented Framework for Distributed Verilog Simulation

Proceedings of the seventeenth workshop on Parallel and distributed simulation
A Design-Driven Partitioning Algorithm for Distributed Verilog Simulation

Proceedings of the 21st International Workshop on Principles of Advanced and Distributed Simulation
A Multiway Design-driven Partitioning Algorithm for Distributed Verilog Simulation

Simulation
Scalable graph clustering using stochastic flows: applications to community discovery

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Graph clustering based on optimization of a macroscopic structure of clusters

DS'11 Proceedings of the 14th international conference on Discovery science

Abstract

Move-based iterative improvement partitioning (IIP) methods, such as the Fiduccia-Mattheyses (FM) algorithm [Fiduccia and Mattheyses 1982] and Krishnamurthy's Look-Ahead (LA) algorithm [Krishnamurthy 1984], are widely used in VLSI CAD applications, largely due to their time efficiency and ease of implementation. These algorithms are of the "local/greedy improvement" type and generate relatively high-quality results for small and medium-size circuits. However, as VLSI circuits become larger, they suffer a rapid deterioration in solution quality. We propose new IIP methods, CLIP and CDIP, that select cells to move with a view to moving clusters that straddle the two subsets of a partition into one of the subsets. The new algorithms significantly improve partition quality while preserving the advantage of time efficiency. Experimental results on 25 medium- to large-size ACM/SIGDA benchmark circuits show up to 70% improvement over FM in mincut, and average mincut improvements of about 35% over all circuits and 47% over large circuits. They also outperform state-of-the-art non-IIP techniques, the quadratic-programming-based method Paraboli [Reiss et al. 1994] and the spectral partitioner MELO [Alpert and Yao 1995], by about 17% and 23%, respectively, with less CPU time. This demonstrates the potential of sophisticated IIP algorithms in dealing with the increasing complexity of emerging VLSI circuits. We also compare CLIP and CDIP to hMetis [Karypis et al. 1997], one of the best of the recent state-of-the-art partitioners that are based on the multilevel paradigm (others include MLc [Alpert et al. 1997] and LSR/MFFS [Cong et al. 1997]). The results show that one scheme of hMetis is 8% worse than CLIP/CDIP and the other two schemes are only 2-4% better. However, CLIP/CDIP have advantages over hMetis and other multilevel partitioners that outweigh these minimal mincut improvements.
The first is much faster times to solution (for example, CLIP-LA2, one of our best schemes, is 6.4 and 11.75 times faster than the two best hMetis schemes) and much better scalability with circuit size (e.g., for the largest circuit, with about 162K nodes, CLIP-LA2 is 10.4 and 21.5 times faster than the two best hMetis schemes and obtains better solution quality). Second, CLIP/CDIP are "flat" partitioners, while multilevel techniques perform a sequence of node clustering/coarsening steps before partitioning the circuit. In complex placement applications such as timing-driven placement in the presence of multiple constraints, such circuit coarsening can hide crucial information needed for good-quality solutions, thus making the partitioning process oblivious to it. This, however, is not a problem with flat partitioners like CLIP/CDIP, which can take all important parameters into account while partitioning. All these advantages make CLIP/CDIP suitable for use in complex physical design problems for large, deep-submicron VLSI circuits.
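
For readers unfamiliar with move-based IIP, the following is a minimal sketch of one pass of a plain FM-style bipartitioner on a tiny netlist. It is not CLIP or CDIP, whose cluster-aware cell-selection rules are not spelled out in the abstract, and it recomputes gains from scratch rather than using the bucket structure that gives FM its linear-time behavior. All names (`cut_size`, `fm_gain`, `fm_pass`, `max_side_size`) and the example netlist are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Net = Sequence[str]  # a net is the list of (distinct) cells it connects


def cut_size(nets: List[Net], side: Dict[str, int]) -> int:
    """Count nets that have cells on both sides of the partition."""
    return sum(1 for net in nets if len({side[c] for c in net}) == 2)


def fm_gain(cell: str, nets: List[Net], side: Dict[str, int]) -> int:
    """Reduction in cut size obtained by moving `cell` to the other side."""
    gain = 0
    for net in nets:
        if cell not in net:
            continue
        same = sum(1 for c in net if side[c] == side[cell])
        other = len(net) - same
        if same == 1 and other > 0:
            gain += 1  # cell is alone on its side: the move uncuts this net
        elif other == 0 and len(net) > 1:
            gain -= 1  # net lies entirely on cell's side: the move cuts it
    return gain


def fm_pass(nets: List[Net], side: Dict[str, int],
            max_side_size: int) -> Tuple[Dict[str, int], int]:
    """One pass: move each cell at most once, keep the best partition seen."""
    side = dict(side)
    locked = set()
    best_side, best_cut = dict(side), cut_size(nets, side)
    for _ in range(len(side)):
        size = {s: sum(1 for v in side.values() if v == s) for s in (0, 1)}
        # Only moves that keep the destination side within the size bound.
        candidates = [c for c in side
                      if c not in locked and size[1 - side[c]] < max_side_size]
        if not candidates:
            break
        cell = max(candidates, key=lambda c: fm_gain(c, nets, side))
        side[cell] = 1 - side[cell]  # tentative move, even if gain is negative
        locked.add(cell)
        cut = cut_size(nets, side)
        if cut < best_cut:
            best_cut, best_side = cut, dict(side)
    return best_side, best_cut


# Hypothetical netlist: two 3-cell clusters joined by a single net (c, d).
nets = [("a", "b"), ("b", "c"), ("a", "c"),
        ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
start = {"a": 0, "b": 0, "d": 0, "c": 1, "e": 1, "f": 1}
best_side, best_cut = fm_pass(nets, start, max_side_size=4)
```

In this toy case the starting partition splits both clusters across the cut, and the pass pulls each cluster back onto one side. On large circuits a greedy pass of this kind often moves only part of a cluster before the remaining gains turn negative, which is the situation the cluster-aware selection in CLIP and CDIP is meant to address.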