The number of cores in future CPUs is expected to increase steadily. Balanced CPU designs scale hardware cache-coherency functionality with the number of cores in order to minimize bottlenecks in parallel applications. An alternative approach is to do away with hardware coherence entirely; the Single-chip Cloud Computer (SCC), a 48-core experimental processor from Intel Labs, does exactly that. To support MPI on the SCC, the RCKMPI library introduced a wait-free protocol for message passing over non-coherent buffers. In this work, the message passing performance of that protocol is modeled. Additionally, a port to symmetric multiprocessors is introduced and compared against MPICH2-Nemesis and Open MPI. Performance is analyzed based on statistics collected over a 4-dimensional space composed of source rank, target rank, message size, and message frequency.