International Journal of Parallel Programming
In this paper, we present OmpSs, a programming model based on OpenMP and StarSs that can also incorporate OpenCL or CUDA kernels. We evaluate the proposal on three different architectures, SMP, Cell/B.E., and GPUs, demonstrating the broad applicability of the approach. The evaluation uses four benchmarks: Matrix Multiply, BlackScholes, Perlin Noise, and Julia Set. We compare the results against executions of the same benchmarks written in OpenCL on the same architectures. The results show that OmpSs greatly outperforms the OpenCL environment: it is more flexible in exploiting multiple accelerators, and, thanks to the simplicity of its annotations, it increases programmer productivity.