Practical Compiler Techniques on Efficient Multithreaded Code Generation for OpenMP Programs

Authors:
Xinmin Tian; Milind Girkar; Aart Bik; Hideki Saito
Affiliations:
Intel Compiler Labs, Software and Solutions Group, Intel Corporation, 3600 Juliette Lane, Santa Clara, CA 95052, USA (all authors)
Venue:
The Computer Journal
Year:
2005

Abstract

State-of-the-art multiprocessor systems pose several difficulties: (i) the user has to parallelize existing serial code; (ii) explicitly threaded programs that use a thread library are not portable; (iii) writing efficient multithreaded programs requires intimate knowledge of the machine's architecture and micro-architecture. Well-tuned parallelizing compilers are therefore in high demand to exploit advances in NUMA-based multiprocessors, simultaneous multithreading processors and chip-multiprocessor systems, in response to the demand for performance from the high-performance computing community. Meanwhile, OpenMP* has emerged as the industry-standard parallel programming model. Applications can be parallelized using OpenMP with less effort and in a way that is portable across a wide range of multiprocessor systems. In this paper, we present several practical compiler optimization techniques and discuss their effect on the performance of OpenMP programs. We elaborate on the major design considerations in a high-performance OpenMP compiler and present experimental data based on the implementation of the optimizations in the Intel® C++ and Fortran compilers. We also discuss how the OpenMP transformation interacts with other sequential optimizations in the compiler. The techniques in this paper have achieved significant performance improvements on the industry-standard SPEC* OMPM2001 and SPEC* OMPL2001 benchmarks, and these performance results are presented for Intel® Pentium® and Itanium® processor-based systems.
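To make the portability claim concrete, the following is a minimal sketch of directive-based parallelization in C. It is a generic illustration and is not taken from the paper; the function name `dot_product` and its parameters are hypothetical. A single OpenMP directive asks the compiler to generate the multithreaded code for the loop, and a compiler without OpenMP support simply ignores the pragma and runs the loop serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function (not from the paper): computes the dot
 * product of two arrays of length n. */
double dot_product(const double *a, const double *b, int n)
{
    assert(a != NULL && b != NULL);

    double sum = 0.0;

    /* The directive requests that loop iterations be divided among threads.
     * The reduction clause gives each thread a private partial sum and
     * combines the partial sums when the loop ends. When OpenMP is not
     * enabled, the pragma is ignored and the loop executes serially. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }

    return sum;
}
```

With GCC or Clang, OpenMP is typically enabled with `-fopenmp`; without that flag the same source still compiles and produces the same result for this input-independent summation order case, which is the portability property the abstract refers to. Thread creation, work distribution and the final combination of partial sums are left to the compiler and its runtime library rather than written by hand against a thread library.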