Evaluation of a virtual shared memory machine by the compilation of data parallel loops

Authors:
Fabrizio Baiardi;Davide Guerri;Paolo Mori;Laura Ricci
Affiliations:
Dipartimento di Informatica, Università di Pisa;Dipartimento di Informatica, Università di Pisa;Dipartimento di Informatica, Università di Pisa;Dipartimento di Informatica, Università di Pisa
Venue:
EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
Year:
2000

Abstract

We introduce DVSA (Distributed Virtual Shared Areas), a virtual machine that supports information sharing on distributed memory architectures. The shared memory is structured as a set of areas, and the size of each area may be chosen within an architecture-dependent range. DVSA supports the sharing of areas rather than of variables because exchanging chunks of data may yield better performance on distributed memory architectures that offer little or no hardware support for information sharing. DVSA does not implement replication or prefetching strategies, under the assumption that these strategies should be implemented by application-specific virtual machines. The definition of these machines may often be driven by the compilation of the adopted programming languages. To validate this assumption, we first consider the implementation of data parallel loops and show that a set of static analyses based on the closed forms approach makes it possible to define compiler-driven caching and prefetching strategies. These strategies fully exploit the operations offered by the DVSA machine and noticeably reduce the time to access shared information. The optimization strategies that the compiler can exploit include merging accesses to avoid multiple accesses to the same area, prefetching areas, and reducing the overhead due to barrier synchronization. Preliminary performance figures are discussed.
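
To make the access-merging idea concrete, the following is a minimal sketch and not the DVSA interface. It assumes a toy `AreaStore` class, a fixed `AREA_SIZE`, and a `read_area` operation, all of which are hypothetical names introduced here for illustration. It also groups accesses at run time, whereas the abstract describes static, closed-form analyses performed by the compiler. The sketch only shows why grouping the element accesses of a loop by area reduces the number of area reads.

```python
from collections import defaultdict

AREA_SIZE = 4  # hypothetical: number of elements stored in one shared area


class AreaStore:
    """Toy stand-in for an area-based shared memory; counts area reads."""

    def __init__(self, data, area_size=AREA_SIZE):
        self.data = list(data)
        self.area_size = area_size
        self.reads = 0  # number of (simulated) remote area reads

    def read_area(self, area_id):
        # Each call models one whole-area transfer.
        self.reads += 1
        start = area_id * self.area_size
        return self.data[start:start + self.area_size]


def naive_gather(store, indices):
    """Read one area per element access, even when areas repeat."""
    out = []
    for i in indices:
        area = store.read_area(i // store.area_size)
        out.append(area[i % store.area_size])
    return out


def merged_gather(store, indices):
    """Group accesses by area so that each area is read only once."""
    by_area = defaultdict(list)
    for pos, i in enumerate(indices):
        by_area[i // store.area_size].append((pos, i % store.area_size))
    out = [None] * len(indices)
    for area_id in sorted(by_area):
        area = store.read_area(area_id)
        for pos, offset in by_area[area_id]:
            out[pos] = area[offset]
    return out
```

With 16 elements and the indices `[0, 1, 2, 5, 6, 9]`, the naive version performs one area read per element, while the merged version reads only the three distinct areas involved. A compiler that knows the index expressions of a data parallel loop could, in principle, derive the same grouping ahead of time.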