Managing Multicore with OpenMP (Extended Abstract)

Authors:
Barbara Chapman
Affiliations:
Department of Computer Science, University of Houston, Houston, TX 77204-3010, USA
Venue:
Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Year:
2008

Abstract

High-end distributed-memory and distributed shared-memory platforms with many thousands of cores will be deployed in the coming years to solve the toughest technical problems. Their individual nodes will be heterogeneous, multithreaded, multicore systems with deep memory hierarchies, capable of executing many threads of control. For example, the petascale architecture to be put into production at the US National Center for Supercomputing Applications (NCSA) in 2011 is based on the IBM Power7 chip, which uses multicore processor technology. Thousands of compute nodes with over 200,000 cores are envisioned. The Roadrunner system that will be deployed at the Los Alamos National Laboratory (LANL) is expected to have heterogeneous nodes, configured with both AMD Opterons and IBM Cells, and a similar number of cores.