We introduce DVSA (distributed virtual shared areas), a virtual machine that supports the sharing of information on distributed-memory architectures. The shared memory is structured as a set of areas, and the size of each area may be chosen within an architecture-dependent range. DVSA supports the sharing of areas rather than of variables because exchanging chunks of data may yield better performance on distributed-memory architectures that offer little or no hardware support for information sharing. DVSA does not implement replication or prefetching strategies, under the assumption that these strategies should be implemented by application-specific virtual machines. The definition of these machines may often be driven by the compilation of the adopted programming languages. To validate this assumption, we first consider the implementation of data parallel loops and show that a set of static analyses based on the closed forms approach makes it possible to define compiler-driven caching and prefetching strategies. These strategies fully exploit the operations offered by the DVSA machine and noticeably reduce the time needed to access shared information. The optimization strategies that the compiler can exploit include merging accesses to avoid multiple accesses to the same area, prefetching areas, and reducing the overhead due to barrier synchronization. Preliminary performance figures are discussed.
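The access-merging idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the DVSA interface: `AreaStore`, `read_area`, `naive_loop`, `merged_loop`, and the area size of 4 elements are all hypothetical names and values chosen for illustration, and the shared memory is simulated by a local Python list. The sketch only shows why fetching each distinct area once, rather than once per element access, reduces the number of area transfers in a loop.

```python
from collections import OrderedDict

AREA_SIZE = 4  # hypothetical area size, in elements (an assumption)


class AreaStore:
    """Toy stand-in for a shared store organised as fixed-size areas."""

    def __init__(self, data, area_size=AREA_SIZE):
        self.data = list(data)
        self.area_size = area_size
        self.reads = 0  # number of whole-area transfers performed

    def read_area(self, area_id):
        """Return a copy of one whole area and count the transfer."""
        self.reads += 1
        start = area_id * self.area_size
        return self.data[start:start + self.area_size]


def naive_loop(store, indices):
    """Perform one area transfer for every element access."""
    out = []
    for i in indices:
        area = store.read_area(i // store.area_size)
        out.append(area[i % store.area_size])
    return out


def merged_loop(store, indices):
    """Fetch each distinct area once, then serve all accesses locally."""
    indices = list(indices)
    # Distinct areas touched by the loop, in first-use order.
    needed = OrderedDict.fromkeys(i // store.area_size for i in indices)
    cache = {area_id: store.read_area(area_id) for area_id in needed}
    return [cache[i // store.area_size][i % store.area_size] for i in indices]
```

In this sketch the set of areas is computed at run time from the index list; a compiler using static analysis would instead derive it from the loop bounds and subscript expressions before the loop executes, which is also what would allow the fetches to be issued early as prefetches.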