Array regrouping and structure splitting using whole-program reference affinity
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
On-the-fly elimination of dynamic irregularities for GPU computing
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
The performance of Graphics Processing Units (GPUs) is sensitive to irregular memory references. A recent study shows the promise of eliminating irregular references through runtime thread-data remapping. However, how to determine the optimal mapping efficiently remains an open question. This paper presents an initial exploration of that question, focusing on data layout optimization. It describes three algorithms that compute or approximate optimal data layouts for GPUs. These algorithms span a spectrum of tradeoffs among space cost, time cost, and the quality of the resulting data layouts.
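To make the idea of data layout remapping concrete, the following is a minimal sketch in Python, not one of the three algorithms the paper describes. It assumes a kernel in which thread t reads data[indices[t]], and it uses a deliberately simplified cost model: a warp of 32 threads issues one memory transaction per distinct 32-element segment it touches. The names count_transactions and relocate, and both constants, are hypothetical and chosen only for illustration.

```python
WARP_SIZE = 32      # threads per warp (assumed for this sketch)
SEGMENT_ELEMS = 32  # elements served by one memory transaction (assumed)


def count_transactions(indices, warp_size=WARP_SIZE, segment=SEGMENT_ELEMS):
    """Sum, over all warps, the number of distinct segments each warp touches."""
    total = 0
    for start in range(0, len(indices), warp_size):
        warp = indices[start:start + warp_size]
        total += len({i // segment for i in warp})
    return total


def relocate(data, indices):
    """Copy data into a new layout so that thread t reads new_data[t].

    An element is duplicated whenever indices repeats a value, which is
    where the space cost of this naive layout comes from.
    """
    new_data = [data[i] for i in indices]
    new_indices = list(range(len(indices)))
    return new_data, new_indices


# Irregular access pattern: a stride-37 permutation over 128 elements.
data = [x * 10 for x in range(128)]
indices = [(t * 37) % 128 for t in range(128)]

before = count_transactions(indices)
new_data, new_indices = relocate(data, indices)
after = count_transactions(new_indices)
print("transactions before:", before, "after:", after)
```

Under this model the relocated layout lets every warp read one contiguous segment, at the price of a full copy of the data and a pass over the index array. That is one point in the space, time, and layout-quality tradeoff the abstract refers to; the paper's algorithms are not reproduced here.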