NUMA-aware image compositing on multi-GPU platform

Authors:
Pan Wang;Zhiquan Cheng;Ralph Martin;Huahai Liu;Xun Cai;Sikun Li
Affiliations:
School of Computer Science, National University of Defense Technology, Hunan, China;School of Computer Science, National University of Defense Technology, Hunan, China;School of Computer Science & Informatics, Cardiff University, Cardiff, UK;School of Computer Science, National University of Defense Technology, Hunan, China;School of Computer Science, National University of Defense Technology, Hunan, China;School of Computer Science, National University of Defense Technology, Hunan, China
Venue:
The Visual Computer: International Journal of Computer Graphics
Year:
2013


Visualization

Abstract

Sort-last parallel rendering is widely used. Recent GPU developments mean that a PC equipped with multiple GPUs is a viable alternative to a high-cost supercomputer: the Fermi architecture supports unified virtual addressing on each GPU, providing a foundation for non-uniform memory access (NUMA) on multi-GPU platforms. Such hardware changes require users to reconsider parallel rendering algorithms. In this paper, we thoroughly investigate NUMA-aware image compositing, the key final stage of sort-last parallel rendering. Building on the proven radix-k strategy, we identify an optimal compositing algorithm that takes advantage of the NUMA architecture of the multi-GPU platform. We quantitatively analyze different image compositing modes for practical use, taking into account peer-to-peer communication costs between GPUs. Our experiments on various datasets show that our image compositing method is very fast: eight GPUs can composite an image of a few megapixels in about 10 ms.
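For readers unfamiliar with the radix-k strategy the abstract builds on, the following is a minimal sketch of how generic radix-k partitions processes into exchange groups in each round. It is not the NUMA-aware algorithm of the paper, and it models no GPU topology or peer-to-peer cost. The function name `radix_k_groups` and its parameters are hypothetical, chosen only for illustration. The process count is factored into a vector of k-values; in each round, ranks that differ only in the corresponding mixed-radix digit form one group and exchange image pieces among themselves.

```python
from math import prod


def radix_k_groups(num_procs, k_values):
    """Return, per round, the groups of ranks that exchange image pieces.

    Illustrative sketch of generic radix-k grouping (hypothetical helper,
    not the paper's NUMA-aware variant). The product of k_values must
    equal num_procs.
    """
    if prod(k_values) != num_procs:
        raise ValueError("product of k_values must equal num_procs")

    rounds = []
    stride = 1  # distance between group members in the current round
    for k in k_values:
        groups = []
        seen = set()
        for rank in range(num_procs):
            if rank in seen:
                continue
            # Digit of this rank in the current round's mixed-radix position.
            digit = (rank // stride) % k
            base = rank - digit * stride
            group = [base + j * stride for j in range(k)]
            seen.update(group)
            groups.append(group)
        rounds.append(groups)
        stride *= k
    return rounds


if __name__ == "__main__":
    # Eight processes (for example, eight GPUs) with three factorizations.
    for k_values in ([2, 2, 2], [4, 2], [8]):
        print(k_values, radix_k_groups(8, k_values))
```

With all k-values equal to 2 the grouping matches binary swap, and with a single k-value equal to the process count it matches direct send, which is why radix-k is usually described as a configurable generalization of both.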