Evaluating the XMT Parallel Programming Model

Authors:
Dorit Naishlos; Joseph Nuzman; Chau-Wen Tseng; Uzi Vishkin
Venue:
HIPS '01 Proceedings of the 6th International Workshop on High-Level Parallel Programming Models and Supportive Environments
Year:
2001

Abstract

Explicit-multithreading (XMT) is a parallel programming model designed to exploit on-chip parallelism. Its features include a simple thread execution model and an efficient prefix-sum instruction for synchronizing accesses to shared data. By taking advantage of low-overhead parallel threads and high on-chip memory bandwidth, the XMT model tries to reduce the burden on programmers by obviating the need for explicit task assignment and thread coarsening. This paper presents the features of the XMT programming model and evaluates their utility through experiments on a prototype XMT compiler and architecture simulator. We find that the lack of explicit task assignment has only a slight effect on performance for the XMT architecture. Despite low thread overhead, thread coarsening is still necessary to some extent, but it can usually be applied automatically by the XMT compiler. The prefix-sum instruction provides more scalable synchronization than traditional locks, and the simple run-until-completion thread execution model (no busy-waits) does not impair performance. Finally, the combination of features in XMT can encourage simpler parallel algorithms that may be more efficient than more traditional, complex approaches.
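
As a rough illustration of how a prefix-sum primitive can coordinate shared data accesses without busy-waiting, the sketch below emulates it in Python. It is not XMT code and not taken from the paper. It assumes the prefix-sum operation behaves like an atomic fetch-and-add (add an increment to a shared base, return the value held before the add). The names `PrefixSumBase`, `ps`, and `compact_nonzero` are hypothetical. The Python lock only stands in for the atomicity that a hardware instruction would supply and says nothing about its performance.

```python
import threading


class PrefixSumBase:
    """Hypothetical stand-in for a shared prefix-sum base value."""

    def __init__(self, value=0):
        self._value = value
        # Assumption: the lock merely emulates the atomicity of a hardware
        # instruction; it is not how a real prefix-sum unit would work.
        self._lock = threading.Lock()

    def ps(self, increment):
        """Atomically add `increment` and return the value held before the add."""
        with self._lock:
            old = self._value
            self._value += increment
            return old

    @property
    def value(self):
        return self._value


def compact_nonzero(values):
    """Copy the non-zero entries of `values` into a dense list.

    One short-lived thread is started per element, with no explicit
    assignment of elements to processors.
    """
    out = [None] * len(values)
    base = PrefixSumBase(0)

    def body(i):
        # Each thread runs to completion and never waits on another thread.
        if values[i] != 0:
            slot = base.ps(1)      # claim a unique output index
            out[slot] = values[i]

    threads = [threading.Thread(target=body, args=(i,)) for i in range(len(values))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                   # all threads finish before the result is read
    return out[:base.value]


result = compact_nonzero([3, 0, 7, 0, 0, 5, 1])
print(sorted(result))              # order inside `result` depends on scheduling
```

Every thread that finds a non-zero element receives a distinct slot from `ps`, so no two threads write to the same position and none has to wait for another to make progress. The order of the compacted elements is not fixed, which is why the example sorts the result before printing it.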