Efficient support for pipelining in software distributed shared memory systems

Authors:
Karthik Balasubramanian; David K. Lowenthal
Affiliations:
Department of Computer Science, University of Georgia; Department of Computer Science, University of Georgia
Venue:
Real-time system security
Year:
2003

Abstract

Though more difficult to program, distributed-memory parallel machines provide greater scalability than their shared-memory counterparts. Software Distributed Shared Memory (SDSM) systems provide the abstraction of shared memory on a distributed machine. While SDSMs offer an attractive programming model, they currently cannot efficiently support all classes of scientific applications. One such class consists of applications with recurrences that cause dependencies across processors or nodes. A popular solution to such problems is pipelining, which breaks the computation into blocks: each processor computes a block, which enables the next processor in the pipeline to compute its corresponding block. Once the pipeline is filled, the computation of blocks proceeds in parallel. While pipelining is useful, it is not efficiently supported by current SDSM systems.

This paper presents an approach to integrating pipelining into SDSM systems. We describe our design and implementation of one-way pipelining in an SDSM. The key idea is to retain the shared-memory model but design the extensions so that execution mimics what an explicit message-passing program would do. We show that one-way pipelining is superior to the two most common ways to program pipelined applications: distributed locks and explicit matrix transposition. Finally, we show that one-way pipelining is competitive with a hand-coded, explicit message-passing program.
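The block-pipelining idea described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it is a single-process Python simulation, and the recurrence, the row-band distribution, and the names `sequential`, `pipelined`, `n_procs`, and `block` are all assumptions chosen for illustration. It shows only the schedule: processor p can compute column block b once processor p-1 has finished block b, so all (p, b) pairs with the same p + b are independent and could run in parallel.

```python
def sequential(n_rows, n_cols):
    """Reference result for the assumed recurrence a[i][j] = a[i-1][j] + a[i][j-1]."""
    a = [[1] * n_cols for _ in range(n_rows)]
    for i in range(1, n_rows):
        for j in range(1, n_cols):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def pipelined(n_rows, n_cols, n_procs, block):
    """Simulate a pipelined schedule for the same recurrence.

    Assumptions (hypothetical, for illustration only): rows are split into
    contiguous bands, one per processor, and n_rows is divisible by n_procs.
    Columns are split into blocks of width `block`.
    Returns the computed array and the number of pipeline steps.
    """
    a = [[1] * n_cols for _ in range(n_rows)]
    rows_per_proc = n_rows // n_procs
    n_blocks = (n_cols + block - 1) // block
    n_steps = n_procs + n_blocks - 1

    for step in range(n_steps):
        # Every (p, b) with p + b == step depends only on work from step - 1,
        # so on a real machine these blocks would execute concurrently.
        for p in range(n_procs):
            b = step - p
            if not 0 <= b < n_blocks:
                continue  # processor p is idle: pipeline filling or draining
            row_lo = max(1, p * rows_per_proc)
            row_hi = (p + 1) * rows_per_proc
            col_lo = max(1, b * block)
            col_hi = min(n_cols, (b + 1) * block)
            for i in range(row_lo, row_hi):
                for j in range(col_lo, col_hi):
                    a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a, n_steps


if __name__ == "__main__":
    result, steps = pipelined(n_rows=8, n_cols=12, n_procs=4, block=3)
    print(result == sequential(8, 12), steps)
```

With 4 processors and 4 column blocks, the simulated pipeline completes in 4 + 4 - 1 = 7 steps, whereas running the same 16 blocks one after another would take 16 block computations. The sketch ignores communication cost, which is the part an SDSM or message-passing system has to make efficient.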