A design methodology for synthesizing parallel algorithms and architectures
Journal of Parallel and Distributed Computing
A synthesis method for designing highly parallel algorithms in VLSI is presented. To illustrate the method, the familiar long multiplication algorithm for binary numbers is used. This algorithm is specified in the language Crystal, a very-high-level language for parallel processing. A total of 18 designs are derived from this specification. Each is optimal within its own class, which is characterized by a space-time map. The relative merits and tradeoffs of different designs are systematically compared and evaluated.
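
To make the running example concrete, the following is a minimal Python sketch of binary long multiplication written as a computation over index pairs (i, j), together with one illustrative linear space-time map. It is not the paper's Crystal specification, and the map shown is not claimed to be one of the 18 derived designs. The names `long_multiply`, `schedule`, and `schedule_is_valid`, the little-endian bit-list representation, and the particular map (time = i + 2j, processor = i) are all assumptions chosen for illustration.

```python
def to_bits(x, width):
    """Little-endian bit list of a non-negative integer (hypothetical helper)."""
    return [(x >> k) & 1 for k in range(width)]


def from_bits(bits):
    """Inverse of to_bits: little-endian bit list back to an integer."""
    return sum(b << k for k, b in enumerate(bits))


def long_multiply(a_bits, b_bits):
    """Schoolbook long multiplication on little-endian bit lists.

    Cell (i, j) adds the partial-product bit a[i] & b[j] into position i + j,
    consuming the carry produced by cell (i - 1, j).
    """
    n, m = len(a_bits), len(b_bits)
    product = [0] * (n + m)
    for j in range(m):                 # one row per multiplier bit
        carry = 0
        for i in range(n):             # one cell per multiplicand bit
            total = product[i + j] + (a_bits[i] & b_bits[j]) + carry
            product[i + j] = total & 1
            carry = total >> 1
        product[j + n] = carry         # carry out of the row
    return product


def schedule(i, j):
    """Assumed linear space-time map: cell (i, j) -> (time, processor)."""
    return i + 2 * j, i


def schedule_is_valid(n, m):
    """Check that the assumed map respects data dependencies and never
    assigns two cells to the same processor at the same time step."""
    seen = set()
    for j in range(m):
        for i in range(n):
            t, p = schedule(i, j)
            if (t, p) in seen:
                return False
            seen.add((t, p))
            deps = []
            if i > 0:
                deps.append((i - 1, j))            # carry from the left cell
            if j > 0:
                # running sum bit from the previous row; at the row's last
                # cell it is the previous row's carry out
                deps.append((i + 1, j - 1) if i + 1 < n else (n - 1, j - 1))
            if any(schedule(*d)[0] >= t for d in deps):
                return False
    return True
```

Under these assumptions, `from_bits(long_multiply(to_bits(13, 4), to_bits(11, 4)))` recovers the ordinary product of 13 and 11, and `schedule_is_valid(4, 4)` confirms that every cell runs strictly after the cells it depends on. Other choices of time and processor functions give different tradeoffs between processor count and completion time, which is the kind of design space the abstract describes.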