Experiments with list ranking for explicit multi-threaded (XMT) instruction parallelism

Authors:
Shlomit Dascal;Uzi Vishkin
Affiliations:
Univ. of Maryland, College Park;Univ. of Maryland, College Park
Venue:
Journal of Experimental Algorithmics (JEA)
Year:
2000

Citing 18
Cited 2

Deterministic coin tossing and accelerating cascades: micro and macro techniques for designing parallel algorithms

STOC '86 Proceedings of the eighteenth annual ACM symposium on Theory of computing
Faster optimal parallel prefix sums and list ranking

Information and Computation
A simple randomized parallel algorithm for list-ranking

Information Processing Letters
Computer organization & design: the hardware/software interface

Computer organization & design: the hardware/software interface
List ranking and list scan on the Cray C-90

SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Efficient massively parallel implementation of some combinatorial algorithms

Theoretical Computer Science
Better trade-offs for parallel list ranking

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
From algorithm parallelism to instruction-level parallelism: an encode-decode chain using prefix-sum

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
The art of computer programming, volume 2 (3rd ed.): seminumerical algorithms

The art of computer programming, volume 2 (3rd ed.): seminumerical algorithms
Explicit multi-threading (XMT) bridging models for instruction parallelism (extended abstract)

Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Computer architecture (2nd ed.): a quantitative approach

Computer architecture (2nd ed.): a quantitative approach
Practical parallel list ranking

Journal of Parallel and Distributed Computing
A no-busy-wait balanced tree parallel algorithmic paradigm

Proceedings of the twelfth annual ACM symposium on Parallel algorithms and architectures
Parallel Computer Architecture: A Hardware/Software Approach

Parallel Computer Architecture: A Hardware/Software Approach
Ultimate Parallel List Ranking?

HiPC '99 Proceedings of the 6th International Conference on High Performance Computing
VLSI Architecture: Past, Present, and Future

ARVLSI '99 Proceedings of the 20th Anniversary Conference on Advanced Research in VLSI
Randomized speed-ups in parallel computation

STOC '84 Proceedings of the sixteenth annual ACM symposium on Theory of computing
A Simple Optimal List Ranking Algorithm

HIPC '98 Proceedings of the Fifth International Conference on High Performance Computing

Towards a first vertical prototyping of an extremely fine-grained parallel programming approach

Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Better speedups using simpler parallel programming for graph connectivity and biconnectivity

Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores


Abstract

Algorithms for the problem of list ranking are empirically studied with respect to the Explicit Multi-Threaded (XMT) platform for instruction-level parallelism (ILP). The main goal of this study is to understand the differences between XMT and more traditional parallel computing implementation platforms/models as they pertain to the well studied list ranking problem. The main two findings are: (i) good speedups for much smaller inputs are possible and (ii) in part, the first finding is based on a new variant of a 1984 algorithm, called the No-Cut algorithm. The paper incorporates analytic (non-asymptotic) performance analysis into experimental performance analysis for relatively small inputs. This provides an interesting example where experimental research and theoretical analysis complement one another.

Explicit Multi-Threading (XMT) is a fine-grained computation framework introduced in our SPAA'98 paper. Building on some key ideas of parallel computing, XMT covers the spectrum from algorithms through architecture to implementation; the main implementation-related innovation in XMT was through the incorporation of low-overhead hardware and software mechanisms (for more effective fine-grained parallelism). The reader is referred to that paper for detail on these mechanisms. The XMT platform aims at faster single-task completion time by way of ILP.
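
For readers unfamiliar with the list ranking problem named in the abstract, the following is a minimal illustrative sketch, not the paper's No-Cut algorithm and not XMT code. It assumes a linked list stored as a successor array (`succ[i]` is the index of the node after node `i`, or `None` at the tail) and defines the rank of a node as its distance to the tail. The function names `rank_serial` and `rank_pointer_jumping` are hypothetical. The second function simulates the standard textbook pointer-jumping technique serially; on a parallel machine each inner loop iteration would run as its own thread.

```python
from typing import List, Optional


def rank_serial(succ: List[Optional[int]]) -> List[int]:
    """Rank each node by walking the list once from head to tail."""
    n = len(succ)
    if n == 0:
        return []
    # The head is the only node that is nobody's successor.
    has_pred = [False] * n
    for s in succ:
        if s is not None:
            has_pred[s] = True
    head = has_pred.index(False)

    # Collect nodes in list order, then number them from the tail.
    order = []
    node: Optional[int] = head
    while node is not None:
        order.append(node)
        node = succ[node]
    rank = [0] * n
    for dist, v in enumerate(reversed(order)):
        rank[v] = dist
    return rank


def rank_pointer_jumping(succ: List[Optional[int]]) -> List[int]:
    """Serial simulation of pointer jumping (about log2(n) rounds)."""
    n = len(succ)
    nxt = list(succ)
    # Initially each node knows only the distance to its direct successor.
    rank = [0 if s is None else 1 for s in succ]
    while any(s is not None for s in nxt):
        # Double buffering mimics a synchronous parallel step:
        # every node reads the old arrays and writes the new ones.
        new_rank, new_nxt = list(rank), list(nxt)
        for i in range(n):  # conceptually one thread per node
            j = nxt[i]
            if j is not None:
                new_rank[i] = rank[i] + rank[j]  # add the distance covered by j
                new_nxt[i] = nxt[j]              # jump over j
        rank, nxt = new_rank, new_nxt
    return rank
```

For the list 0 → 3 → 2 → 1, stored as `succ = [3, None, 1, 2]`, both functions return `[3, 0, 1, 2]`. Pointer jumping performs more total work than the serial walk (roughly n log n versus n operations), which is one reason the choice of algorithm matters when inputs are small.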