GPU-based iterative transmission reconstruction in 3D ultrasound computer tomography

Authors:
Matthias Birk; Robin Dapp; N. V. Ruiter; J. Becker
Venue:
Journal of Parallel and Distributed Computing
Year:
2014

Abstract

Because today's standard screening methods frequently fail to detect breast cancer before metastases have developed, early diagnosis remains a major challenge. With its promise of high-quality volume images, three-dimensional ultrasound computer tomography is likely to improve this situation, but it has high computational demands. In this work, we investigate accelerating the ray-based transmission reconstruction with a GPU-based implementation of the iterative numerical optimization algorithm TVAL3. We identified the regular and transposed sparse matrix-vector multiplications as the performance-limiting operations. For accelerated reconstruction, we propose two different concepts and devise a hybrid scheme as the optimal configuration. In addition, we investigate multi-GPU scalability and derive the optimal number of devices for our two primary use cases: a fast preview mode and a high-resolution mode. To obtain a fair estimate of the speedup, we compare our implementation to an optimized CPU version of the algorithm. Using our accelerated implementation, we reconstructed a preview 3D volume with 24,576 unknowns, a voxel size of (8 mm)^3, and approximately 200,000 equations in 0.5 s. A high-resolution volume with 1,572,864 unknowns, a voxel size of (2 mm)^3, and approximately 1.6 million equations was reconstructed in 23 s. This constitutes an acceleration of over one order of magnitude compared with the optimized CPU version.
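To illustrate the two operations the abstract names as performance-limiting, the following is a minimal CPU sketch of a regular and a transposed sparse matrix-vector product. It assumes the system matrix is stored in compressed sparse row (CSR) form; the abstract does not state the storage format or describe the two proposed concepts, so the function names `csr_matvec` and `csr_rmatvec` and the tiny example matrix are purely hypothetical and do not represent the authors' implementation.

```python
# Sketch only: CSR storage and all names here are assumptions for illustration.

def csr_matvec(n_rows, indptr, indices, data, x):
    """Compute y = A x for a CSR matrix A (row-wise gather)."""
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Entries of this row occupy positions indptr[row] .. indptr[row + 1] - 1.
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def csr_rmatvec(n_cols, indptr, indices, data, r):
    """Compute z = A^T r from the same CSR arrays (column-wise scatter)."""
    z = [0.0] * n_cols
    for row in range(len(indptr) - 1):
        for k in range(indptr[row], indptr[row + 1]):
            # Different rows may write to the same z entry.
            z[indices[k]] += data[k] * r[row]
    return z


# Hypothetical 2 x 3 example: A = [[1, 0, 2],
#                                  [0, 3, 0]]
indptr = [0, 2, 3]
indices = [0, 2, 1]
data = [1.0, 2.0, 3.0]

y = csr_matvec(2, indptr, indices, data, [1.0, 1.0, 1.0])
z = csr_rmatvec(3, indptr, indices, data, [1.0, 2.0])
```

In this sketch the regular product gathers into one output entry per row, so rows can be processed independently. The transposed product scatters into output entries shared between rows, which in a parallel setting generally requires atomic updates or a separately stored transpose. This asymmetry is one common reason the two products are treated differently on GPUs.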