Portable image processing applications require an efficient, scalable platform with localized computing regions. This paper presents a new class of systolic architecture with area I/O that exploits the physical locality of planar data streams by processing data where it falls. A synthesis technique based on dependence graphs, data partitioning, and computation mapping is developed to handle planar data streams and to design area-I/O arrays systematically. Simulation results show that area I/O provides a 16-fold speedup over systems with perimeter I/O. Performance comparisons on a set of signal processing algorithms show that systolic arrays designed with planar data streams in mind are up to three times faster than traditional arrays.
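The difference between perimeter I/O and area I/O can be illustrated with a toy loading model. This is a minimal sketch under stated assumptions, not the paper's simulation: it assumes a perimeter-I/O array accepts one image column per cycle through a single edge and shifts data inward, while an area-I/O array delivers every pixel directly to the processing element beneath it in one step. The function names `load_via_perimeter` and `load_via_area` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]


def load_via_perimeter(image: Image) -> Tuple[Image, int]:
    """Fill the array through its left edge, one column per cycle (assumed model)."""
    rows, cols = len(image), len(image[0])
    array = [[0] * cols for _ in range(rows)]
    cycles = 0
    # Feed the right-most image column first so it ends up furthest from the edge.
    for c in reversed(range(cols)):
        for r in range(rows):
            # Every row shifts one cell away from the edge; the new pixel enters at index 0.
            array[r] = [image[r][c]] + array[r][:-1]
        cycles += 1
    return array, cycles


def load_via_area(image: Image) -> Tuple[Image, int]:
    """Every cell receives its own pixel in a single step (assumed model)."""
    return [row[:] for row in image], 1


if __name__ == "__main__":
    n = 4
    img = [[r * n + c for c in range(n)] for r in range(n)]
    perimeter_array, perimeter_cycles = load_via_perimeter(img)
    area_array, area_cycles = load_via_area(img)
    print("perimeter load cycles:", perimeter_cycles)
    print("area load cycles:", area_cycles)
```

In this toy model the loading cost of perimeter I/O grows with the array width, whereas area I/O stays constant. The model covers data loading only and is not intended to reproduce the speedup figures reported in the abstract.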