Box-counting algorithm on GPU and multi-core CPU: an OpenCL cross-platform study

Authors:
Jesús Jiménez; Juan Ruiz De Miras
Affiliations:
Department of Computer Science, University of Jaén, Jaén, Spain 23071; Department of Computer Science, University of Jaén, Jaén, Spain 23071
Venue:
The Journal of Supercomputing
Year:
2013

Abstract

In this paper, we present the analysis and development of a cross-platform OpenCL implementation of the box-counting algorithm, which is one of the most widely used methods for estimating the Fractal Dimension. The Fractal Dimension is a relevant image analysis method used in several disciplines, but computing it is in general a time-consuming process, especially when working with 3D images. Unlike parallel programming models that strictly depend on the hardware type and manufacturer, like CUDA, OpenCL allows us to provide an implementation suitable for execution on both GPUs and multi-core CPUs, regardless of the hardware manufacturer. Sorting is a key part of the fast box-counting algorithm, and the final speedup is highly conditioned by the efficiency of the sorting algorithm used. Our study reveals that current OpenCL implementations of sorting algorithms are clearly slower than both CUDA implementations for GPUs and specific multi-core CPU implementations. Our OpenCL algorithm has been specifically optimized according to the type of the target device, and the results show an average speedup of up to 7.46× and 4× when executed on the GPU and the multi-core CPU respectively, both compared with the single-threaded (sequential) CPU implementation.
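To illustrate why sorting matters for box counting, the following is a minimal sequential sketch in Python. It is not the authors' OpenCL implementation. It assumes the image is given as a list of occupied integer voxel coordinates, and the function names `box_count` and `fractal_dimension` are hypothetical. Each point is mapped to the box that contains it, the box keys are sorted so that equal keys become adjacent, and the occupied boxes are counted as the number of key changes. The dimension is then estimated as the least-squares slope of log N(s) against log(1/s).

```python
import math


def box_count(points, box_size):
    """Count occupied boxes of side `box_size` (hypothetical helper)."""
    # Map each 3D point to the integer coordinates of the box containing it.
    keys = [(x // box_size, y // box_size, z // box_size) for x, y, z in points]
    # After sorting, points that share a box are adjacent, so the number of
    # occupied boxes equals the number of changes between neighbouring keys.
    keys.sort()
    count = 0
    previous = None
    for key in keys:
        if key != previous:
            count += 1
            previous = key
    return count


def fractal_dimension(points, box_sizes):
    """Estimate the dimension as the slope of log N(s) versus log(1/s)."""
    xs = [math.log(1.0 / s) for s in box_sizes]
    ys = [math.log(box_count(points, s)) for s in box_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope of the log-log points.
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Assumed test input: a completely filled 16x16x16 cube of voxels.
    cube = [(x, y, z) for x in range(16) for y in range(16) for z in range(16)]
    print(fractal_dimension(cube, [1, 2, 4, 8]))
```

In this sketch the sort dominates the running time for large inputs, which is consistent with the abstract's remark that the overall speedup depends heavily on the efficiency of the sorting step.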