Partitioning and Mapping Algorithms into Fixed Size Systolic Arrays
IEEE Transactions on Computers
Programmable active memories: reconfigurable systems come of age
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Selecting tile shape for minimal execution time
Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
Scheduling of Partitioned Regular Algorithms on Processor Arrays with Constrained Resources
ASAP '96 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors
Automatic synthesis of systolic arrays from uniform recurrent equations
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
Two-dimensional orthogonal tiling: from theory to practice
HIPC '96 Proceedings of the Third International Conference on High-Performance Computing (HiPC '96)
Optimal Partitioning for FPGA Based Regular Array Implementations
PARELEC '00 Proceedings of the International Conference on Parallel Computing in Electrical Engineering
PPMC: a programmable pattern based memory controller
ARC'12 Proceedings of the 8th international conference on Reconfigurable Computing: architectures, tools and applications
Hi-index | 0.00 |
In this paper, we focus on system level-optimizations for automatic parallelization of nested loop on Reconfigurable Accelerators. Specifically, as off-chip bandwidth plays a major role in total performances for such implementations, we propose some partitioning techniques based on loop tiling which can take advantage of the hierarchically structured RA memory systems.