Extending the OpenMP standard for thread mapping and grouping
IWOMP'05/IWOMP'06 Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming
Modern architectures are becoming more heterogeneous. OpenMP currently has no mechanism for assigning work to specific parts of these heterogeneous architectures. We propose a combination of thread mapping and subteams as a means to give programmers control over how work is allocated on these architectures. Experiments with a prototype implementation on the Cell Broadband Engine show the benefit of allowing OpenMP teams to be created across the different elements of a heterogeneous architecture.