Accelerating advanced MRI reconstructions on GPUs
Journal of Parallel and Distributed Computing
Computational acceleration on graphics processing units (GPUs) can make advanced magnetic resonance imaging (MRI) reconstruction algorithms attractive in clinical settings, thereby improving the quality of MR images across a broad spectrum of applications. At present, MR imaging is often limited by high noise levels, significant imaging artifacts, and long data acquisition (scan) times. Advanced image reconstruction algorithms can mitigate these limitations and improve image quality by simultaneously operating on scan data acquired with arbitrary trajectories and incorporating additional information such as anatomical constraints. However, the improvements in image quality come at the expense of a considerable increase in computation. This paper describes the acceleration of an advanced reconstruction algorithm on NVIDIA's Quadro FX 5600. Optimizations such as register-allocating the voxel data, tiling the scan data, and storing the scan data in the Quadro's constant memory dramatically reduce the reconstruction's required bandwidth to off-chip memory. The Quadro's special functional units substantially accelerate the trigonometric computations in the algorithm's inner loops, and experimentally tuned code transformations increase the reconstruction's performance by an additional 20%. The reconstruction of a 3D image with 128^3 voxels ultimately achieves 150 GFLOPS and requires less than two minutes on the Quadro, while reconstruction on a quad-core CPU is thirteen times slower. Furthermore, relative to the true image, the error exhibited by the advanced reconstruction is only 12%, while conventional reconstruction techniques incur an error of 42%. In short, the acceleration afforded by the GPU greatly increases the appeal of the advanced reconstruction for clinical MRI applications.
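The abstract does not state the reconstruction formula, so the following Python sketch rests on an assumption: that the inner loop accumulates, for each voxel, a sum of complex exponentials over the scan samples. The function names, the data layout, and the tile size are hypothetical and chosen only for illustration. The tiled version mimics the structure of the optimizations named above. A slice of the sample list stands in for a tile of scan data held in constant memory, local variables stand in for register-allocated voxel data, and `math.sin` and `math.cos` mark where the trigonometric work of the inner loop falls. It is a sketch of the loop shape, not the paper's GPU kernel.

```python
import cmath
import math


def reconstruct_direct(voxels, samples):
    """Reference version: for each voxel, sum d * exp(2*pi*i * k.x) over all samples.

    voxels  : list of (x, y, z) coordinates
    samples : list of (kx, ky, kz, d) with d a complex scan value
    """
    out = []
    for (x, y, z) in voxels:
        acc = 0j
        for (kx, ky, kz, d) in samples:
            phase = 2.0 * math.pi * (kx * x + ky * y + kz * z)
            acc += d * cmath.exp(1j * phase)
        out.append(acc)
    return out


def reconstruct_tiled(voxels, samples, tile_size=4):
    """Same sum, with the scan samples processed one fixed-size tile at a time."""
    out = [0j] * len(voxels)
    for start in range(0, len(samples), tile_size):
        # Stand-in for loading one tile of scan data into constant memory.
        tile = samples[start:start + tile_size]
        for n, (x, y, z) in enumerate(voxels):
            # Locals stand in for voxel data and accumulators kept in registers.
            acc_re, acc_im = out[n].real, out[n].imag
            for (kx, ky, kz, d) in tile:
                phase = 2.0 * math.pi * (kx * x + ky * y + kz * z)
                c, s = math.cos(phase), math.sin(phase)
                # Complex multiply d * (c + i*s), written out in real arithmetic.
                acc_re += d.real * c - d.imag * s
                acc_im += d.real * s + d.imag * c
            out[n] = complex(acc_re, acc_im)
    return out
```

Under these assumptions, each tile of scan data is read once and reused across every voxel, which is the reuse pattern that lets a small, cached memory serve the inner loop. Both functions compute the same sums, so the tiled version can be checked against the direct one on small inputs.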