Validity of Interprocedural Data Remapping
This paper describes MARS, an automatic parallelising compiler targeted at shared memory machines. It uses a data partitioning approach, traditionally used for distributed memory machines, to globally reduce overheads such as communication and synchronisation. Its high-level linear algebraic representation allows the direct application of, for instance, unimodular transformations and the global application of data transformations. Although a data-based approach allows global analysis and in many instances outperforms local, loop-orientated parallelisation approaches, we have identified two particular problems when applying data parallelism to sequential Fortran 77 as opposed to data parallel dialects tailored to distributed memory targets. This paper describes two techniques to overcome these problems and evaluates their applicability. Preliminary results on two SPECfp92 benchmarks show that, with these optimisations, MARS outperforms existing state-of-the-art loop-based auto-parallelisation approaches.
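To make the linear algebraic representation concrete, the sketch below shows how a compiler can express loop interchange as a unimodular matrix applied to the iteration space. This is an illustrative Python sketch only, not MARS's actual implementation: the matrix `U`, the helper `apply_unimodular`, and the example loop bounds are all assumptions for the purpose of the example.

```python
# Illustrative sketch: loop interchange as a unimodular transformation of
# the iteration space (hypothetical helper, not the MARS compiler's code).

def apply_unimodular(U, iterations):
    """Map each iteration vector i to the matrix-vector product U * i."""
    return [
        tuple(sum(U[r][c] * i[c] for c in range(len(i))) for r in range(len(U)))
        for i in iterations
    ]

# Original 2-deep loop nest: for i in 0..2: for j in 0..3
original = [(i, j) for i in range(3) for j in range(4)]

# Interchange matrix: swaps the i and j loops. Its determinant is -1,
# so the matrix is unimodular and maps integer points to integer points
# bijectively.
U = [[0, 1],
     [1, 0]]

transformed = apply_unimodular(U, original)

# A unimodular transformation is a bijection on the iteration space:
# the same iterations are executed, only visited in a different order.
assert sorted(transformed) == sorted((j, i) for i, j in original)
```

Because any unimodular matrix preserves the integer lattice, the same machinery covers interchange, reversal, and skewing uniformly, which is what makes a single linear algebraic framework attractive for combining loop and data transformations.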