High-performance cone beam reconstruction using CUDA compatible GPUs.pdf
File size:
908k
Description: Compute unified device architecture (CUDA) is a software development platform that
allows us to run C-like programs on the NVIDIA graphics processing unit (GPU). This paper
presents an acceleration method for cone beam reconstruction using CUDA compatible
GPUs. The proposed method accelerates the Feldkamp, Davis, and Kress (FDK) algorithm
using three techniques: (1) off-chip memory access reduction for saving the memory bandwidth;
(2) loop unrolling for hiding the memory latency; and (3) multithreading for
exploiting multiple GPUs. We describe how these techniques can be incorporated into
the reconstruction code. We also show an analytical model to understand the reconstruction
performance on multi-GPU environments. Experimental results show that the proposed
method runs at 83% of the theoretical memory bandwidth, achieving a throughput
of 64.3 projections per second (pps) for reconstruction of a 512^3-voxel volume from 360
512^2-pixel projections. This performance is 41% higher than the previous CUDA-based
method and is 24 times faster than a CPU-based method optimized by vector intrinsics.
Some detailed analyses are also presented to understand how effectively the acceleration
techniques increase the reconstruction performance of a naive method. We also demonstrate
out-of-core reconstruction for large-scale datasets, up to a 1024^3-voxel volume.
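
The figures quoted in the description can be put in perspective with some quick arithmetic. The sketch below is not taken from the paper; it only derives the implied total reconstruction time, the implied baseline throughputs, and the volume memory footprint from the numbers above. Two things are assumptions and are not stated in the description: that each voxel is stored as a 4-byte float, and that every voxel is updated once per projection (as in a voxel-driven backprojection).

```python
# Figures quoted in the description.
THROUGHPUT_PPS = 64.3           # projections per second
NUM_PROJECTIONS = 360
VOLUME_SIDE = 512               # 512^3-voxel volume
SPEEDUP_OVER_PREV_CUDA = 1.41   # "41% higher than the previous CUDA-based method"
SPEEDUP_OVER_CPU = 24.0         # "24 times faster than a CPU-based method"

# Assumption (not stated in the description): 4 bytes per voxel (float32).
BYTES_PER_VOXEL = 4


def reconstruction_seconds(num_projections, pps):
    """Time to backproject all projections at a given throughput."""
    return num_projections / pps


def volume_bytes(side, bytes_per_voxel=BYTES_PER_VOXEL):
    """Memory footprint of a cubic volume with the given side length."""
    return side ** 3 * bytes_per_voxel


total_s = reconstruction_seconds(NUM_PROJECTIONS, THROUGHPUT_PPS)

# Baseline throughputs implied by the quoted speedups.
prev_cuda_pps = THROUGHPUT_PPS / SPEEDUP_OVER_PREV_CUDA
cpu_pps = THROUGHPUT_PPS / SPEEDUP_OVER_CPU

# Assumption: every voxel is updated once per projection.
voxel_updates_per_s = VOLUME_SIDE ** 3 * THROUGHPUT_PPS

if __name__ == "__main__":
    mib = 1024 ** 2
    print(f"total reconstruction time: {total_s:.2f} s")
    print(f"implied previous CUDA method: {prev_cuda_pps:.1f} pps")
    print(f"implied CPU method: {cpu_pps:.2f} pps")
    print(f"voxel updates per second: {voxel_updates_per_s:.3e}")
    print(f"512^3 volume: {volume_bytes(512) // mib} MiB")
    print(f"1024^3 volume: {volume_bytes(1024) // mib} MiB")
```

Under these assumptions the full 360-projection reconstruction takes roughly 5.6 seconds, the implied baselines are about 45.6 pps (previous CUDA method) and about 2.7 pps (CPU method), and a 1024^3-voxel volume needs 4 GiB for voxel data alone, eight times the 512 MiB of the 512^3 case. That size is consistent with the description treating the 1024^3 case as an out-of-core problem.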
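This source package does not currently contain any source code files that can be displayed directly; please download the package.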