Distributed filaments: efficient fine-grain parallelism on a cluster of workstations

Authors:
Vincent W. Freeh; David K. Lowenthal; Gregory R. Andrews
Affiliations:
Department of Computer Science, University of Arizona, Tucson, AZ;Department of Computer Science, University of Arizona, Tucson, AZ;Department of Computer Science, University of Arizona, Tucson, AZ
Venue:
OSDI '94 Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation
Year:
1994



Abstract

A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as iterative grid computations, recursive fork/join programs, the bodies of parallel FOR loops, and the implicit parallelism in functional or dataflow languages. It is useful both to describe massively parallel computations and as a target for code generation by compilers. However, fine-grain parallelism has long been thought to be inefficient due to the overheads of process creation, context switching, and synchronization. This paper describes a software kernel, Distributed Filaments (DF), that implements fine-grain parallelism both portably and efficiently on a workstation cluster. DF runs on existing, off-the-shelf hardware and software. It has a simple interface, so it is easy to use. DF achieves efficiency by using stateless threads on each node, overlapping communication and computation, employing a new reliable datagram communication protocol, and automatically balancing the work generated by fork/join computations.
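
As a rough illustration of what "stateless threads" could mean for an iterative grid computation, the sketch below represents each fine-grain task as nothing more than a function reference plus its arguments, run to completion on the caller's stack. It is not DF's interface: the names `Filament`, `jacobi_point`, `make_pool`, `run_pool`, and `jacobi` are hypothetical, the code runs on a single node, and it leaves out the communication protocol, distributed memory, and load balancing mentioned in the abstract.

```python
from typing import Callable, List, Tuple

# Assumption: a "filament" is modeled as a code reference plus arguments only.
# It has no private stack or saved context, so it runs to completion once started.
Filament = Tuple[Callable[..., None], tuple]


def jacobi_point(old, new, i, j):
    """Body of one filament: update a single interior grid point."""
    new[i][j] = 0.25 * (old[i - 1][j] + old[i + 1][j] + old[i][j - 1] + old[i][j + 1])


def make_pool(old, new) -> List[Filament]:
    """Create one filament descriptor per interior grid point."""
    n = len(old)
    return [(jacobi_point, (old, new, i, j))
            for i in range(1, n - 1) for j in range(1, n - 1)]


def run_pool(pool: List[Filament]) -> None:
    """Run every filament to completion, one after another."""
    for fn, args in pool:
        fn(*args)


def jacobi(grid, sweeps):
    """Perform the given number of Jacobi sweeps and return the resulting grid."""
    old = [row[:] for row in grid]
    new = [row[:] for row in grid]
    for _ in range(sweeps):
        run_pool(make_pool(old, new))  # finishing the pool stands in for a barrier
        old, new = new, old
    return old
```

Because each descriptor carries no stack or saved registers, creating and switching between tasks in this model costs about as much as a function call, which is the kind of overhead reduction the abstract attributes to stateless threads.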