Where is time spent in message-passing and shared-memory programs?

Authors:
Satish Chandra;James R. Larus;Anne Rogers
Affiliations:
Computer Sciences Department, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI, USA;Computer Sciences Department, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI, USA;Department of Computer Science, 35 Olden Street, Princeton University, Princeton, NJ, USA
Venue:
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Year:
1994

Abstract

Message passing and shared memory are two techniques parallel programs use for coordination and communication. This paper studies the strengths and weaknesses of these two mechanisms by comparing equivalent, well-written message-passing and shared-memory programs running on similar hardware. To ensure that our measurements are comparable, we produced two carefully tuned versions of each program and measured them on closely related simulators of a message-passing and a shared-memory machine, both of which are based on the same underlying hardware assumptions.

We examined the behavior and performance of each program carefully. Although the cost of computation in each pair of programs was similar, synchronization and communication differed greatly. We found that message passing's advantage over shared memory is not clear-cut. Three of the four shared-memory programs ran at roughly the same speed as their message-passing equivalents, even though their communication patterns were different.