Benchmarks for cell-based layout systems
DAC '87 Proceedings of the 24th ACM/IEEE Design Automation Conference
A language for describing rectilinear Steiner tree configurations
DAC '86 Proceedings of the 23rd ACM/IEEE Design Automation Conference
A parallel bit map processor architecture for DA algorithms
DAC '81 Proceedings of the 18th Design Automation Conference
The parallel decomposition and implementation of an integrated circuit global router
PPEALS '88 Proceedings of the ACM/SIGPLAN conference on Parallel programming: experience with applications, languages and systems
Characterizing the synchronization behavior of parallel programs
PPEALS '88 Proceedings of the ACM/SIGPLAN conference on Parallel programming: experience with applications, languages and systems
Exploiting variable grain parallelism at runtime
PPEALS '88 Proceedings of the ACM/SIGPLAN conference on Parallel programming: experience with applications, languages and systems
Analysis of cache invalidation patterns in multiprocessors
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Process control and scheduling issues for multiprogrammed shared-memory multiprocessors
SOSP '89 Proceedings of the twelfth ACM symposium on Operating systems principles
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
A new placement level wirability estimate with measurements
ACM SIGDA Newsletter
Coarse-grain parallel programming in Jade
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
Modeling the performance of limited pointers directories for cache coherence
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
PHIGURE: a parallel hierarchical global router
DAC '90 Proceedings of the 27th ACM/IEEE Design Automation Conference
SPLASH: Stanford parallel applications for shared-memory
ACM SIGARCH Computer Architecture News
Hiding memory latency using dynamic scheduling in shared-memory multiprocessors
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Cache Invalidation Patterns in Shared-Memory Multiprocessors
IEEE Transactions on Computers
POPL '92 Proceedings of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
PADS '93 Proceedings of the seventh workshop on Parallel and distributed simulation
A new generalized row-based global router
ICCAD '93 Proceedings of the 1993 IEEE/ACM international conference on Computer-aided design
A Transformation Approach to Derive Efficient Parallel Implementations
IEEE Transactions on Software Engineering - Special issue on architecture-independent languages and software tools parallel processing
False Sharing and Spatial Locality in Multiprocessor Caches
IEEE Transactions on Computers
Parallel Global Routing Algorithms for Standard Cells
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
A global router for sea-of-gates circuits
EURO-DAC '91 Proceedings of the conference on European design automation
Kendo: efficient deterministic multithreading in software
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Generating synchronization statements in divide-and-conquer programs
Parallel Computing
Hi-index | 0.01 |
A fast and easily parallelizable global routing algorithm for standard cells and its parallel implementation is presented. LocusRoute is meant to be used as the cost function for a placement algorithm and so this context constrains the structure of the global routing algorithm and its parallel implementation. The router is based on enumerating a subset of all two-bend routes between two points, and results in 16% to 37% fewer total number of tracks than the TimberWolf global router for standard cells [Sech85]. It is comparable in quality to a maze router and an industrial router, but is factor of 10 times or more faster. Three approaches to parallelizing the router are implemented: wire-by-wire parallelism, segment-by-segment and route-by-route. Two of these approaches achieve significant speedup - route-by-route achieves up to 4.6 using eight processors, and wire-by-wire achieves from 5.8 to 7.6 on eight processors.