Proceedings of the 24th ACM International Conference on Supercomputing
Graphics processing units (GPUs) deliver high performance on problems with regular structure, but they perform poorly on irregular tasks because irregular problem structures do not map well onto SIMD-like GPU architectures. In this paper, we explore software approaches for improving the performance of irregular parallel computation on graphics processors. We propose general approaches that eliminate branch divergence and allow runtime load balancing. We evaluate the optimization rules and approaches using the n-queens problem as a benchmark. The experimental results show that the proposed approaches can substantially improve the performance of irregular computation on GPUs. These general approaches could be easily applied to many other irregular problems to improve their performance.
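To make the irregularity of the n-queens benchmark concrete, the following is a minimal CPU-side sketch in Python, not the paper's GPU implementation. It assumes a bitmask board encoding and an explicit stack in place of recursion, and it splits the search into one task per first-row column. The names `count_from` and `solve_by_tasks` are hypothetical and chosen only for illustration.

```python
def count_from(n, cols, diag1, diag2, row):
    """Count completions of a partial placement with an explicit stack.

    cols, diag1, diag2 are bitmasks of the columns attacked in the current
    row. Returns (solutions, nodes_visited). Illustrative sketch only.
    """
    full = (1 << n) - 1
    solutions = 0
    nodes = 0
    stack = [(cols, diag1, diag2, row)]
    while stack:
        c, d1, d2, r = stack.pop()
        nodes += 1
        if r == n:
            # All n rows hold a queen: one complete solution.
            solutions += 1
            continue
        # Columns of row r that no earlier queen attacks.
        free = full & ~(c | d1 | d2)
        while free:
            bit = free & -free      # lowest free column
            free ^= bit
            stack.append((c | bit,
                          ((d1 | bit) << 1) & full,  # diagonal shifts one way
                          (d2 | bit) >> 1,           # anti-diagonal the other
                          r + 1))
    return solutions, nodes


def solve_by_tasks(n):
    """Split the search into n independent tasks, one per first-row column."""
    full = (1 << n) - 1
    results = []
    for col in range(n):
        bit = 1 << col
        results.append(count_from(n, bit, (bit << 1) & full, bit >> 1, 1))
    return results


if __name__ == "__main__":
    for col, (solutions, nodes) in enumerate(solve_by_tasks(8)):
        print(f"first-row column {col}: {solutions} solutions, {nodes} nodes")
```

The per-task node counts typically differ, so tasks assigned statically to parallel lanes finish at different times; that imbalance, together with the data-dependent inner loop, is the kind of irregularity the abstract refers to.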