Exploiting thread-data affinity in OpenMP with data access patterns

Authors:
Andrea Di Biagio; Ettore Speziale; Giovanni Agosta
Affiliations:
Dipartimento di Elettronica ed Informazione, Politecnico di Milano (all authors)
Venue:
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Year:
2011

Abstract

In modern NUMA architectures, preserving data access locality is key to achieving good performance. We define, for the OpenMP programming model, a class of architecture-agnostic programmer hints that describe the behaviour of parallel loops. These hints depend only on properties of the program, in particular on the data accessed by each loop iteration. The runtime combines this information with architectural information gathered during its initialization to guide task scheduling when loop iterations are scheduled dynamically. We demonstrate the effectiveness of the proposed technique on the NAS Parallel Benchmarks suite, achieving an average speedup of 1.21x.
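
The scheduling idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation and not OpenMP syntax: the hint is modelled as a function from an iteration index to the first array element it touches, pages are assumed to be block-distributed across NUMA nodes, and the names `build_queues`, `next_iteration`, `page_owner` and `PAGE_SIZE` are all hypothetical.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical architectural parameter)


def page_owner(page, pages_per_node):
    """Assumed placement policy: pages are block-distributed across NUMA nodes."""
    return page // pages_per_node


def iteration_node(i, hint, elem_size, pages_per_node):
    """NUMA node holding the first element that iteration i touches, per the hint."""
    first_elem = hint(i)  # program-only information supplied by the programmer
    page = (first_elem * elem_size) // PAGE_SIZE
    return page_owner(page, pages_per_node)


def build_queues(n_iters, hint, elem_size, pages_per_node):
    """Combine the hint with architectural data: one iteration queue per node."""
    queues = defaultdict(list)
    for i in range(n_iters):
        queues[iteration_node(i, hint, elem_size, pages_per_node)].append(i)
    return dict(queues)


def next_iteration(queues, my_node):
    """Dynamic scheduling step: take local work first, otherwise steal remotely."""
    if queues.get(my_node):
        return queues[my_node].pop(0)
    remote = [n for n, q in queues.items() if q]
    if not remote:
        return None  # the loop is finished
    victim = max(remote, key=lambda n: len(queues[n]))
    return queues[victim].pop()  # steal from the tail of the longest remote queue


# Example: iteration i processes row i of a matrix with 512 doubles per row,
# so each row fills exactly one 4096-byte page; each node owns 4 pages.
row_hint = lambda i: i * 512
queues = build_queues(8, row_hint, elem_size=8, pages_per_node=4)
```

In this example iterations 0 to 3 are queued on node 0 and iterations 4 to 7 on node 1. A thread asks for remote work only after its own node's queue is empty, which is one simple way to keep dynamic scheduling while favouring local accesses.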