Data and computation transformations for multiprocessors
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Operating system support for improving data locality on CC-NUMA compute servers
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Data distribution support on distributed shared memory multiprocessors
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
The SGI Origin: a ccNUMA highly scalable server
Proceedings of the 24th annual international symposium on Computer architecture
High-level management of communication schedules in HPF-like languages
ICS '98 Proceedings of the 12th international conference on Supercomputing
Is data distribution necessary in OpenMP?
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Extending OpenMP for NUMA machines
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
The trade-off between implicit and explicit data distribution in shared-memory programming paradigms
ICS '01 Proceedings of the 15th international conference on Supercomputing
Architecture and design of AlphaServer GS320
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Scalable Shared-Memory Multiprocessing
Scaling irregular parallel codes with minimal programming effort
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
Dynamic page placement to improve locality in CC-NUMA multiprocessors for TPC-C
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
IPPS '95 Proceedings of the 9th International Symposium on Parallel Processing
WildFire: A Scalable Path for SMPs
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Exploiting thread-data affinity in OpenMP with data access patterns
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
In this paper we explore the idea of customizing and reusing loop schedules to improve the scalability of irregular numerical codes on shared-memory architectures with non-uniform memory access latency. The main objective is to implicitly set up affinity links between threads and data, by devising loop schedules that achieve a balanced work distribution over irregular data spaces and reusing them as much as possible throughout the execution of the program for better memory access locality. This transformation provides a great deal of flexibility in optimizing locality without compromising the simplicity of the shared-memory programming paradigm; in particular, the programmer does not need to explicitly distribute data between processors. The paper presents practical examples from real applications, along with experiments demonstrating the efficiency of the approach.