NUMA-aware image compositing on multi-GPU platform

  • Authors:
  • Pan Wang; Zhiquan Cheng; Ralph Martin; Huahai Liu; Xun Cai; Sikun Li

  • Affiliations:
  • School of Computer Science, National University of Defense Technology, Hunan, China; School of Computer Science, National University of Defense Technology, Hunan, China; School of Computer Science & Informatics, Cardiff University, Cardiff, UK; School of Computer Science, National University of Defense Technology, Hunan, China; School of Computer Science, National University of Defense Technology, Hunan, China; School of Computer Science, National University of Defense Technology, Hunan, China

  • Venue:
  • The Visual Computer: International Journal of Computer Graphics
  • Year:
  • 2013

Visualization

Abstract

Sort-last parallel rendering is widely used. Recent GPU developments mean that a PC equipped with multiple GPUs is a viable alternative to a high-cost supercomputer: the Fermi architecture of a single GPU supports unified virtual addressing (UVA), providing a foundation for non-uniform memory access (NUMA) on multi-GPU platforms. Such hardware changes require parallel rendering algorithms to be reconsidered. In this paper, we thoroughly investigate the NUMA-aware image compositing problem; image compositing is the key final stage in sort-last parallel rendering. Building on the proven radix-k strategy, we identify an optimal compositing algorithm that takes advantage of the NUMA architecture of the multi-GPU platform. We quantitatively analyze different image compositing modes for practical image compositing, taking into account peer-to-peer communication costs between GPUs. Our experiments on various datasets show that our image compositing method is very fast: an image of a few megapixels can be composited in about 10 ms by eight GPUs.
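
For readers unfamiliar with radix-k, the following is a minimal CPU-only sketch of the general strategy the abstract builds on. It is not the authors' NUMA-aware algorithm and does not model GPUs or peer-to-peer transfer costs. The names `composite`, `radix_k`, and `gather` are hypothetical, and the sketch assumes depth compositing (nearest fragment wins) with pixels stored as `(depth, color)` tuples. In each round, ranks form groups of size k; every member keeps one k-th of its current image region and composites the matching pieces held by the other members of its group.

```python
from math import prod


def composite(a, b):
    """Depth-composite two equal-length pixel lists; the nearer fragment wins."""
    return [min(pa, pb) for pa, pb in zip(a, b)]


def radix_k(images, ks):
    """Simulate radix-k compositing for len(images) ranks.

    images: one full-size partial image per rank (list of (depth, color)).
    ks:     group size per round; the product must equal the number of ranks.
    Returns, per rank, (start_offset, pixels) for the final region it owns.
    """
    p = len(images)
    assert prod(ks) == p, "group sizes must factor the number of ranks"
    # Every rank starts out holding the whole image, beginning at offset 0.
    state = [(0, list(img)) for img in images]
    stride = 1
    for k in ks:
        new_state = [None] * p
        for rank in range(p):
            digit = (rank // stride) % k           # position inside the group
            base = rank - digit * stride
            group = [base + j * stride for j in range(k)]
            start, pixels = state[rank]
            # This rank keeps the digit-th of k pieces of its current region.
            lo = digit * len(pixels) // k
            hi = (digit + 1) * len(pixels) // k
            piece = state[group[0]][1][lo:hi]
            for g in group[1:]:
                # In a real system this slice would be a message or peer copy.
                piece = composite(piece, state[g][1][lo:hi])
            new_state[rank] = (start + lo, piece)
        state = new_state
        stride *= k
    return state


def gather(state, n):
    """Assemble the final n-pixel image from the per-rank regions."""
    out = [None] * n
    for start, pixels in state:
        out[start:start + len(pixels)] = pixels
    return out
```

With eight ranks, `ks = [2, 2, 2]` corresponds to binary swap and `ks = [8]` to direct send; intermediate factorizations such as `[4, 2]` trade the number of rounds against the number of messages per round, which is the kind of choice a communication-cost analysis would inform.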