The scheduling of parallel loops in OpenMP has been a research topic for over a decade. While many methods have been proposed, most focus on adapting the loop schedule purely at runtime, without regard for the overall system state. We present a fully automatic loop scheduling policy that adapts both to the characteristics of the input program and to the current runtime behaviour of the system, including external load. Using state-of-the-art polyhedral compiler analysis, we generate effort estimation functions that the runtime system then uses to derive the optimal loop schedule for a given loop, work group size, iteration range and system state. We demonstrate performance improvements of up to 82% over default scheduling in an unloaded scenario, and of up to 471% in a scenario with external load. We further show that even in the worst case, the performance achieved by our automated system stays within 3% of that of a manually tuned strategy.