Efficient scheduling of recursive control flow on GPUs
Proceedings of the 27th international ACM conference on International conference on supercomputing
Motivated by the recent trend towards small-scale SIMD processing, this paper addresses the vectorization of multigrid codes on modern microprocessors. The aim is to demonstrate that this relatively new feature can benefit not only multimedia programs but also numerical codes of this kind. As target kernels we consider both standard and robust multigrid algorithms, which demand different vectorization strategies. We also study the well-known NAS-MG program from the NAS Parallel Benchmarks. In all cases, the performance benefits are quite satisfactory. This research is particularly relevant if in-processor parallelism is to be used to further increase the speedup obtained from other optimizations, such as efficient memory-hierarchy exploitation or multiprocessor parallelization.