Dynamic data migration for structured AMR solvers

Authors:
Markus Nordén;Henrik Löf;Jarmo Rantakokko;Sverker Holmgren
Affiliations:
Department of Information Technology, Uppsala University, Uppsala, Sweden;Department of Information Technology, Uppsala University, Uppsala, Sweden;Department of Information Technology, Uppsala University, Uppsala, Sweden;Department of Information Technology, Uppsala University, Uppsala, Sweden
Venue:
International Journal of Parallel Programming
Year:
2007



Abstract

On cc-NUMA multi-processors, the non-uniformity of main memory latencies motivates the need for co-location of threads and data. We call this special form of data locality geographical locality. In this article, we study the performance of a parallel PDE solver with adaptive mesh refinement (AMR). The solver is parallelized using OpenMP, and the adaptive mesh refinement makes dynamic load balancing necessary. Because the runtime adaptation causes a dynamically changing memory access pattern, achieving a high degree of geographical locality is a challenging task. The main conclusions of the study are: (1) that geographical locality is very important for the performance of the solver, (2) that the performance can be improved significantly using dynamic page migration of misplaced data, (3) that a migrate-on-next-touch directive works well, whereas the first-touch strategy is less advantageous for programs exhibiting dynamically changing memory access patterns, and (4) that the overhead for such migration is low compared to the total execution time.
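
The contrast between first-touch placement and migrate-on-next-touch can be illustrated with a small toy model. The sketch below is not the solver or the operating-system directive studied in the article; the `Memory` class, its method names, the page count, and the node assignments are all hypothetical and chosen only for illustration. It assumes one accessing node per page per sweep and counts how many accesses are remote after a load-balancing step changes which node works on which page.

```python
from typing import List, Optional


class Memory:
    """Toy page table (hypothetical): tracks the home node of each page."""

    def __init__(self, num_pages: int) -> None:
        self.home: List[Optional[int]] = [None] * num_pages  # None = untouched
        self.flagged: List[bool] = [False] * num_pages       # move on next touch?
        self.migrations = 0                                  # pages moved so far

    def mark_migrate_on_next_touch(self) -> None:
        """Flag every placed page so that it follows its next accessor."""
        self.flagged = [h is not None for h in self.home]

    def sweep(self, owner: List[int]) -> int:
        """Access page p from node owner[p]; return the number of remote accesses."""
        remote = 0
        for p, node in enumerate(owner):
            if self.home[p] is None:
                # First touch: the page is placed on the accessing node.
                self.home[p] = node
            elif self.flagged[p]:
                # Flagged page: move it to the accessing node if it is elsewhere.
                if self.home[p] != node:
                    self.home[p] = node
                    self.migrations += 1
                self.flagged[p] = False
            elif self.home[p] != node:
                # Page stays where it is; the access crosses nodes.
                remote += 1
        return remote


if __name__ == "__main__":
    before = [0, 0, 0, 0, 1, 1, 1, 1]  # assumed partition before rebalancing
    after = [0, 0, 1, 1, 1, 1, 1, 1]   # assumed partition after rebalancing

    first_touch = Memory(len(before))
    first_touch.sweep(before)
    print("first-touch only, remote accesses:", first_touch.sweep(after))

    migrating = Memory(len(before))
    migrating.sweep(before)
    migrating.mark_migrate_on_next_touch()
    print("with migration, remote accesses:", migrating.sweep(after))
    print("pages migrated:", migrating.migrations)
```

In this model the first-touch placement stays fixed after the initial sweep, so every later sweep over the rebalanced partition pays for the misplaced pages again, while the flagged pages are moved once and are local afterwards. The model ignores the cost of each migration, which the article reports as low relative to total execution time.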