Transactions on High-Performance Embedded Architectures and Compilers IV
We identify memory access patterns, common in video processing algorithms, that are poorly suited to the GPU (Graphics Processing Unit) memory system. We develop REDA (Reconfigurable Engine for Data Access), which uses reconfigurable logic for address mapping to improve GPU performance for such access patterns. We show that a sixty-fold reduction in the number of video memory accesses can be achieved for previously unsuited access patterns, with no detriment to well-suited patterns. Surprisingly, memory access locality is also improved.