Using the Barnes-Hut algorithm as an example, we address the design of parallel algorithms that exploit multicore CPUs and GPUs jointly. Specifically, we demonstrate how to modularize a parallel application according to specific aspects of parallel execution. This allows individual modules to be assigned flexibly to either of the two architectures based on their actual performance characteristics. Furthermore, we discuss a hybrid module for the most time-consuming part of the algorithm that uses the CPU and GPU simultaneously and employs a novel load-balancing heuristic. Our experimental evaluation shows that this method greatly increases overall efficiency by making it possible to deploy the optimal configuration of modules on each individual computer system.
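The abstract does not describe the load-balancing heuristic itself. Purely as an illustration of the general idea of sharing one workload between CPU and GPU, the sketch below re-splits the bodies in proportion to the throughput each device achieved in the previous time step. This proportional rule is an assumption chosen for simplicity, not the authors' heuristic, and the function name `rebalance` and all of its parameters are hypothetical.

```python
def rebalance(n_bodies, cpu_bodies, cpu_seconds, gpu_bodies, gpu_seconds):
    """Split n_bodies between CPU and GPU for the next time step.

    Hypothetical helper: a simple proportional rule, NOT the heuristic
    from the paper. The inputs are the number of bodies each device
    processed in the previous step and how long that took.
    """
    # Observed throughput in bodies per second for each device.
    cpu_rate = cpu_bodies / cpu_seconds
    gpu_rate = gpu_bodies / gpu_seconds

    # Give each device a share proportional to its throughput, so that
    # both are expected to finish at roughly the same time.
    cpu_share = cpu_rate / (cpu_rate + gpu_rate)
    new_cpu = round(n_bodies * cpu_share)

    # Assign the remainder to the GPU so that no body is dropped.
    return new_cpu, n_bodies - new_cpu


if __name__ == "__main__":
    # Previous step: an even 500/500 split where the GPU was 4x faster.
    print(rebalance(1000, 500, 2.0, 500, 0.5))
```

A rule of this kind only reacts to the previous step, so it assumes the cost per body changes slowly between steps. It also assumes both devices processed at least one body and took a nonzero amount of time.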