OpenCL implementation of particle swarm optimization: a comparison between multi-core CPU and GPU performances

Authors:
Stefano Cagnoni;Alessandro Bacchini;Luca Mussi
Affiliations:
Dept. of Information Engineering, University of Parma, Italy;Dept. of Information Engineering, University of Parma, Italy;Henesis s.r.l., Parma, Italy
Venue:
EvoApplications'12 Proceedings of the 2012 European conference on Applications of Evolutionary Computation
Year:
2012

Abstract

GPU-based parallel implementations of algorithms are usually compared against the corresponding sequential versions compiled for a single-core CPU machine, without taking advantage of the multi-core and SIMD capabilities of modern processors. This leads to unfair comparisons, in which speed-up figures are much larger than what could actually be obtained if the CPU-based version were properly parallelized and optimized. The availability of OpenCL, which compiles parallel code for both GPUs and multi-core CPUs, has made it much easier to compare the execution speed of different architectures while fully exploiting each architecture's best features. We tested our latest parallel implementations of Particle Swarm Optimization (PSO), compiled under OpenCL for both GPUs and multi-core CPUs, and separately optimized for the two hardware architectures. Our results show that, for PSO, a GPU-based parallelization is still generally more efficient than a multi-core CPU-based one. However, the speed-up of the GPU-based version over the CPU-based version is far lower than the orders-of-magnitude figures reported in papers that compare GPU-based parallel implementations to basic single-thread CPU code.
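
For readers unfamiliar with PSO, the following is a minimal sketch of a generic textbook global-best PSO in plain, sequential Python. It is not the authors' OpenCL implementation, which the abstract does not describe in detail. The function names (`sphere`, `pso`), the parameter values, and the sphere test function are all assumptions chosen for illustration. The per-particle update in the inner loops is the kind of work that parallel versions typically distribute across threads or work-items.

```python
import random


def sphere(x):
    """Hypothetical test function: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def pso(fitness, dim=2, n_particles=20, iters=200,
        w=0.729, c1=1.49445, c2=1.49445, lo=-5.0, hi=5.0, seed=0):
    """Sequential global-best PSO (illustrative sketch, assumed parameters)."""
    rng = random.Random(seed)

    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]

    # The global best is the best of the personal bests.
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]

            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                # The global best is updated as soon as a better point is
                # found, so later particles in the same sweep already see it.
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val

    return gbest, gbest_val


if __name__ == "__main__":
    best, best_val = pso(sphere)
    print(best, best_val)
```

In this sketch each particle's fitness evaluation and velocity update depend only on its own state plus the shared global best, which is why the algorithm is a natural candidate for both multi-core CPU and GPU parallelization.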