This paper describes a set of strategies for mapping irregular codes onto commodity graphics hardware. We start by identifying the resources that current GPUs provide for resolving indirect array accesses entirely in hardware, such as vertices, textures and color tables. We then show how multiple indirections can be mapped onto the graphics pipeline, taking advantage of its streaming architecture to sequence the indirections through successive pipeline stages. Our techniques are applied to typical irregular kernels such as sparse matrix-vector multiplication and the Euler solver. Execution times on GeForce Series GPUs are consistently lower than on Pentium 4 and Athlon 64 processors, with performance depending on floating-point precision.
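To make the notion of an indirect array access concrete, here is a minimal CPU-side sketch of sparse matrix-vector multiplication in compressed sparse row (CSR) form. It is an illustration only, not the paper's GPU mapping: the CSR layout, the function name `spmv_csr` and the array names `row_ptr`, `col_idx` and `values` are assumptions chosen for this example.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (hypothetical helper)."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Nonzeros of this row occupy positions row_ptr[row] .. row_ptr[row + 1] - 1.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect access: the index into x is itself read from col_idx.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Small 3x3 example matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)
```

The lookup `x[col_idx[k]]` is the irregular part: its address is known only after `col_idx[k]` has been read. This is the kind of data-dependent access that the abstract describes resolving in graphics hardware.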