The slogan of last year's International Workshop on OpenMP was "A Practical Programming Model for the Multi-Core Era", yet OpenMP is still entirely agnostic of the hardware architecture. As a consequence, the programmer is left alone with poor performance whenever threads and the data they access end up far apart. In this work we examine the options available to the programmer for improving data and thread affinity in OpenMP programs, using several toy applications, and show how to apply the lessons learned to larger application codes. We fill a gap by implementing explicit data migration on Linux, which provides a next-touch mechanism.
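One common way a programmer can influence data and thread affinity without any OpenMP extension is parallel first-touch initialization. The following is a minimal sketch, not code from this work. It assumes that the operating system places a page on the NUMA node of the thread that first writes it (the default local-allocation behaviour on Linux) and that threads stay bound to their cores, for example by setting `OMP_PROC_BIND=true`. The function names `alloc_first_touch` and `fill_and_sum` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n doubles and initialize them in parallel.
 * Under a first-touch policy (an assumption about the OS), each page ends up
 * on the NUMA node of the thread that writes it first. */
double *alloc_first_touch(size_t n)
{
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;

    /* A static schedule gives every thread one contiguous chunk. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; ++i)
        a[i] = 0.0;

    return a;
}

/* Hypothetical compute phase: it reuses the same static schedule, so each
 * thread works on the chunk it touched during initialization. */
double fill_and_sum(double *a, size_t n, double value)
{
    assert(a != NULL);
    double sum = 0.0;

    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (long i = 0; i < (long)n; ++i) {
        a[i] = value;
        sum += a[i];
    }
    return sum;
}
```

A caller would obtain the array with `alloc_first_touch(n)`, pass it to `fill_and_sum`, and release it with `free`. The sketch only helps while the access pattern of the compute phase matches that of the initialization. When the pattern changes during the run, pages have to be moved, which is the case that explicit data migration and a next-touch mechanism address.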