In this paper, we investigate how much cyclic-graph-based Genetic Programming (GP) can be accelerated on a single machine using currently available mid-range Graphics Processing Units (GPUs). Cyclic graphs pose different evaluation problems than trees do, and we describe how our CUDA-based, "population parallel" evaluator tackles them. Previous work of this kind has focused on evaluation alone. Unfortunately, large reductions in evaluation time do not necessarily translate into similar reductions in total run time, because the time spent on other tasks becomes more significant. We show that this problem can be tackled by having the GPU execute in parallel with the Central Processing Unit (CPU) and with memory transfers. We also demonstrate that a second graphics card can be used to further accelerate a single machine. Together, these additional techniques reduce the total run time of the GPU system by up to 2.83 times. The combined architecture completes a full cyclic GP run 434.61 times faster than the single-core CPU equivalent, evaluating at an average rate of 3.85 billion GP operations per second over the course of the whole run.
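The overlap the abstract describes (GPU kernels running concurrently with CPU work and host-device memory transfers) is the standard CUDA streams pattern. A minimal sketch follows; the kernel body, names, and sizes are illustrative placeholders, not the paper's actual evaluator:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Placeholder for a population-parallel evaluator: one thread per
// individual. The real cyclic-graph interpretation is far more involved.
__global__ void evaluate_population(const float* programs, float* fitness, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) fitness[i] = programs[i] * programs[i];  // stand-in for evaluation
}

// Hypothetical CPU-side GP work (e.g. selection/breeding for another deme)
// that runs while the GPU is busy.
void do_cpu_breeding_work() { /* ... */ }

int main() {
    const int n = 1 << 16;
    float *h_prog, *h_fit, *d_prog, *d_fit;
    cudaMallocHost(&h_prog, n * sizeof(float));  // pinned host memory is
    cudaMallocHost(&h_fit,  n * sizeof(float));  // required for true async copies
    cudaMalloc(&d_prog, n * sizeof(float));
    cudaMalloc(&d_fit,  n * sizeof(float));
    for (int i = 0; i < n; ++i) h_prog[i] = (float)i;

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Queue copy-in, kernel, and copy-out on one stream; all three calls
    // return immediately, leaving the CPU free to do other GP tasks.
    cudaMemcpyAsync(d_prog, h_prog, n * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
    evaluate_population<<<(n + 255) / 256, 256, 0, stream>>>(d_prog, d_fit, n);
    cudaMemcpyAsync(h_fit, d_fit, n * sizeof(float),
                    cudaMemcpyDeviceToHost, stream);

    do_cpu_breeding_work();           // overlapped with the GPU work above

    cudaStreamSynchronize(stream);    // block only when fitness values are needed
    printf("fitness[10] = %f\n", h_fit[10]);

    cudaStreamDestroy(stream);
    cudaFree(d_prog); cudaFree(d_fit);
    cudaFreeHost(h_prog); cudaFreeHost(h_fit);
    return 0;
}
```

Extending this to a second graphics card, as the abstract mentions, amounts to selecting each device with `cudaSetDevice` and giving it its own stream and buffers, so both GPUs evaluate different portions of the population concurrently.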