High-performance cone beam reconstruction using CUDA compatible GPUs

Authors:
Yusuke Okitsu; Fumihiko Ino; Kenichi Hagihara
Affiliations:
Graduate School of Information Science and Technology, Osaka University, 1-5 Yamada-oka, Suita, Osaka 565-0871, Japan
Venue:
Parallel Computing
Year:
2010

Abstract

Compute Unified Device Architecture (CUDA) is a software development platform for running C-like programs on NVIDIA graphics processing units (GPUs). This paper presents an acceleration method for cone beam reconstruction on CUDA-compatible GPUs. The proposed method accelerates the Feldkamp, Davis, and Kress (FDK) algorithm using three techniques: (1) off-chip memory access reduction to save memory bandwidth; (2) loop unrolling to hide memory latency; and (3) multithreading to exploit multiple GPUs. We describe how these techniques can be incorporated into the reconstruction code, and we present an analytical model of reconstruction performance in multi-GPU environments. Experimental results show that the proposed method runs at 83% of the theoretical memory bandwidth, achieving a throughput of 64.3 projections per second (pps) when reconstructing a 512^3-voxel volume from 360 projections of 512^2 pixels each. This throughput is 41% higher than that of the previous CUDA-based method, and the method is 24 times faster than a CPU-based method optimized with vector intrinsics. Detailed analyses show how effectively the acceleration techniques improve the reconstruction performance of a naive method. We also demonstrate out-of-core reconstruction for large-scale datasets of up to 1024^3 voxels.
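
To put the quoted throughput in perspective, the short Python sketch below derives a few figures that the abstract implies but does not state. It is a back-of-the-envelope calculation, not taken from the paper: the constant and function names are hypothetical, it assumes that each projection contributes once to every voxel (as in a plain FDK backprojection), and it reads "41% higher" and "24 times faster" as ratios of projections per second.

```python
# Figures quoted in the abstract.
GPU_PPS = 64.3             # throughput of the proposed method (projections/s)
NUM_PROJECTIONS = 360      # projections in the benchmark dataset
VOLUME_SIDE = 512          # reconstructed volume is VOLUME_SIDE^3 voxels
GAIN_OVER_PREVIOUS = 0.41  # "41% higher than the previous CUDA-based method"
SPEEDUP_OVER_CPU = 24.0    # "24 times faster than a CPU-based method"


def derived_figures():
    """Return quantities implied by the abstract's numbers (approximate)."""
    # Time to backproject the whole dataset at the quoted throughput.
    recon_seconds = NUM_PROJECTIONS / GPU_PPS
    # Assumption: every projection updates every voxel exactly once.
    voxel_updates_per_second = VOLUME_SIDE ** 3 * GPU_PPS
    # Assumption: both comparisons are ratios of throughput in pps.
    previous_gpu_pps = GPU_PPS / (1.0 + GAIN_OVER_PREVIOUS)
    cpu_pps = GPU_PPS / SPEEDUP_OVER_CPU
    return {
        "recon_seconds": recon_seconds,
        "voxel_updates_per_second": voxel_updates_per_second,
        "previous_gpu_pps": previous_gpu_pps,
        "cpu_pps": cpu_pps,
        "cpu_recon_seconds": NUM_PROJECTIONS / cpu_pps,
    }


figures = derived_figures()

if __name__ == "__main__":
    for name, value in figures.items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions, the 512^3-voxel volume is reconstructed in roughly 5.6 seconds, which corresponds to about 8.6 billion voxel updates per second. The implied throughputs are about 46 pps for the previous CUDA-based method and about 2.7 pps for the CPU-based method.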