LocusRoute: a parallel global router for standard cells

Authors:
Jonathan Rose
Affiliations:
Computer Systems Laboratory, Center for Integrated Systems, Stanford University, Stanford CA
Venue:
DAC '88 Proceedings of the 25th ACM/IEEE Design Automation Conference
Year:
1988


Abstract

A fast and easily parallelizable global routing algorithm for standard cells and its parallel implementation are presented. LocusRoute is meant to be used as the cost function for a placement algorithm, and this context constrains the structure of both the global routing algorithm and its parallel implementation. The router is based on enumerating a subset of all two-bend routes between two points, and results in 16% to 37% fewer total tracks than the TimberWolf global router for standard cells [Sech85]. It is comparable in quality to a maze router and an industrial router, but is a factor of 10 or more faster. Three approaches to parallelizing the router are implemented: wire-by-wire, segment-by-segment, and route-by-route parallelism. Two of these approaches achieve significant speedup: route-by-route achieves a speedup of up to 4.6 using eight processors, and wire-by-wire achieves speedups from 5.8 to 7.6 on eight processors.
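The idea of enumerating two-bend routes between two points can be illustrated with a minimal sketch. It is not the paper's implementation. The grid-based cost array, the function names (`segment`, `join`, `two_bend_routes`, `best_route`), and the choice to enumerate every two-bend route inside the bounding box (the abstract says only "a subset") are assumptions made for illustration. Each candidate route is either horizontal-vertical-horizontal or vertical-horizontal-vertical, and the cheapest one under the assumed cost array is selected.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (row, col) on a hypothetical routing grid


def segment(a: Point, b: Point) -> List[Point]:
    """Cells on an axis-aligned segment from a to b, inclusive."""
    (r1, c1), (r2, c2) = a, b
    if r1 == r2:
        step = 1 if c2 >= c1 else -1
        return [(r1, c) for c in range(c1, c2 + step, step)]
    if c1 == c2:
        step = 1 if r2 >= r1 else -1
        return [(r, c1) for r in range(r1, r2 + step, step)]
    raise ValueError("segment must be axis-aligned")


def join(corners: List[Point]) -> List[Point]:
    """Concatenate the segments between consecutive corners, visiting each cell once."""
    path = [corners[0]]
    for a, b in zip(corners, corners[1:]):
        path.extend(segment(a, b)[1:])  # skip the shared corner cell
    return path


def two_bend_routes(src: Point, dst: Point) -> List[List[Point]]:
    """All distinct routes with at most two bends inside the bounding box (assumed scope)."""
    (r1, c1), (r2, c2) = src, dst
    routes = set()
    # Horizontal-vertical-horizontal: the vertical jog sits at column c.
    for c in range(min(c1, c2), max(c1, c2) + 1):
        routes.add(tuple(join([src, (r1, c), (r2, c), dst])))
    # Vertical-horizontal-vertical: the horizontal jog sits at row r.
    for r in range(min(r1, r2), max(r1, r2) + 1):
        routes.add(tuple(join([src, (r, c1), (r, c2), dst])))
    # The two L-shaped routes occur in both families; the set removes duplicates.
    return [list(p) for p in sorted(routes)]


def best_route(cost: List[List[int]], src: Point, dst: Point) -> List[Point]:
    """Pick the candidate whose cells have the lowest total cost."""
    def route_cost(path: List[Point]) -> int:
        return sum(cost[r][c] for r, c in path)
    return min(two_bend_routes(src, dst), key=route_cost)


if __name__ == "__main__":
    # Hypothetical congestion map: the middle row is empty, the rest is crowded.
    demo_cost = [
        [0, 5, 5, 5],
        [0, 0, 0, 0],
        [5, 5, 5, 0],
    ]
    print(best_route(demo_cost, (0, 0), (2, 3)))
```

Because each candidate is scored independently against the cost array, the evaluations can be distributed across processors, which is one plausible reading of the route-by-route parallelism mentioned in the abstract.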