Strategies for cache and local memory management by global program transformation
Journal of Parallel and Distributed Computing - Special Issue on Languages, Compilers and environments for Parallel Programming
Effective jump-pointer prefetching for linked data structures
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Loop Tiling for Reconfigurable Accelerators
FPL '01 Proceedings of the 11th International Conference on Field-Programmable Logic and Applications
Impulse: Building a Smarter Memory Controller
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
RECONFIG '09 Proceedings of the 2009 International Conference on Reconfigurable Computing and FPGAs
A template system for the efficient compilation of domain abstractions onto reconfigurable computers
Journal of Systems Architecture: the EUROMICRO Journal
Hi-index | 0.00 |
One of the main challenges in the design of hardware accelerators is the efficient access of data from the external memory. Improving and optimizing the functionality of the memory controller between the external memory and the accelerators is therefore critical. In this paper, we advance toward this goal by proposing PPMC, the Programmable Pattern-based Memory Controller. This controller supports scatter-gather and strided 1D, 2D and 3D accesses with programmable tiling. Compared to existing solutions, the proposed system provides better performance, simplifies programming access patterns and eases software integration by interfacing to high-level programming languages. In addition, the controller offers an interface for automating domain decomposition via tiling. We implemented and tested PPMC on a Xilinx ML505 evaluation board using a MicroBlaze soft-core as the host processor. The evaluation uses six memory intensive application kernels: Laplacian solver, FIR, FFT, Thresholding, Matrix Multiplication, and 3D-Stencil. The results show that the PPMC-enhanced system achieves at least 10x speed-ups for 1D, 2D and 3D memory accesses as compared to a non-PPMC based setup.