This paper describes a CUDA (Compute Unified Device Architecture) implementation of an algorithm for detecting surface errors on ceramic tiles. It compares the CPU and GPU implementations of the algorithm, analyzes the features of CUDA-capable GPUs, and summarizes the general CUDA programming model. The algorithm, written in C, is relatively simple; a CPU version and a GPU version were both implemented to obtain the test results. The results show that the speedup of the GPU implementation over the CPU implementation grows with image size, reaching a maximum of 4.89 times.
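The abstract does not spell out the detection algorithm, so the following C sketch is only an illustration of the kind of per-pixel CPU baseline such a comparison might start from. It assumes a hypothetical rule (a pixel counts as a defect when it deviates from an expected tile intensity by more than a tolerance); the names `is_defect`, `count_defects`, and `speedup`, and the thresholding rule itself, are assumptions and not taken from the paper. The speedup is taken here to be the usual ratio of CPU time to GPU time.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-pixel rule (an assumption, not the paper's method):
   a pixel is a defect when it differs from the expected tile intensity
   by more than `tolerance`. */
static int is_defect(uint8_t pixel, uint8_t expected, uint8_t tolerance)
{
    return abs((int)pixel - (int)expected) > (int)tolerance;
}

/* CPU baseline: visit every pixel of a row-major grayscale image in turn.
   The test on each pixel is independent of all other pixels, which is the
   property a GPU version could exploit by assigning pixels to threads
   (the final count would then need a reduction or atomic add). */
size_t count_defects(const uint8_t *image, size_t width, size_t height,
                     uint8_t expected, uint8_t tolerance)
{
    size_t defects = 0;
    for (size_t i = 0; i < width * height; ++i)
        defects += (size_t)is_defect(image[i], expected, tolerance);
    return defects;
}

/* Speedup as the ratio of CPU run time to GPU run time. */
double speedup(double cpu_seconds, double gpu_seconds)
{
    return cpu_seconds / gpu_seconds;
}
```

Because the loop body touches one pixel at a time, the amount of independent work scales with the number of pixels. That would be consistent with the reported trend of larger images yielding larger speedups, though the sketch itself makes no claim about the paper's measured timings.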