We describe a systematic method for the design of systolic arrays. The method applies to algorithms that can be expressed as a set of uniform recurrence equations over a convex set D of Cartesian coordinates. Most of the algorithms already considered for systolic implementation can be represented in this way. The method consists of two steps: first, finding a timing function for the computations that is compatible with the dependences introduced by the equations; then, mapping the domain D onto another finite set of coordinates, each representing a processor of the systolic array, in such a way that concurrent computations are mapped onto different processors. The scheduling and mapping functions meet conditions that allow the method to be fully automated. The method is illustrated on the convolution product and the matrix product.
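The two steps can be illustrated with a minimal sketch, which is not taken from the paper itself. It assumes one common uniform-recurrence formulation of the convolution product over points (i, k), with dependence vectors (0, 1), (1, 0) and (1, 1); these vectors, the linear forms `lam` (timing) and `sigma` (processor allocation), and all function names are hypothetical choices made for illustration. The sketch checks that a linear timing function respects every dependence, and that no two computations scheduled at the same time land on the same processor.

```python
from itertools import product

# Assumed dependence vectors for a uniform-recurrence form of 1-D convolution
# (illustrative only; the abstract does not list them).
DEPS = [(0, 1), (1, 0), (1, 1)]


def dot(a, b):
    """Inner product of two integer vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def is_valid_timing(lam, deps):
    """A linear timing function t(p) = lam . p is compatible with the
    dependences if every dependence vector advances time by at least 1."""
    return all(dot(lam, d) >= 1 for d in deps)


def is_conflict_free(lam, sigma, domain):
    """Check that points computed at the same time are mapped to
    different processors, where processor(p) = sigma . p."""
    seen = set()
    for p in domain:
        slot = (dot(lam, p), dot(sigma, p))  # (time step, processor)
        if slot in seen:
            return False
        seen.add(slot)
    return True


# A small rectangular domain D = {(i, k) : 0 <= i < 6, 0 <= k < 3}.
domain = list(product(range(6), range(3)))

lam = (1, 1)    # candidate timing function: t(i, k) = i + k
sigma = (0, 1)  # candidate allocation: processor(i, k) = k

timing_ok = is_valid_timing(lam, DEPS)
mapping_ok = is_conflict_free(lam, sigma, domain)
```

Under these assumptions, `lam = (1, 1)` satisfies the timing condition, and `sigma = (0, 1)` yields a linear array with one processor per value of k. A choice such as `sigma = (1, 1)` would fail the second check, because points (0, 1) and (1, 0) share both a time step and a processor.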