Algorithmic approach to designing an easy-to-program system: Can it lead to a HW-enhanced programmer's workflow add-on?

Authors:
Uzi Vishkin
Affiliations:
University of Maryland Institute for Advanced Computer Studies
Venue:
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
Year:
2009

Abstract

Our earlier parallel algorithmics work on the parallel random-access machine (PRAM) model of computation led us to a PRAM-On-Chip vision: a comprehensive many-core system that can look to the programmer like the abstract PRAM model. We introduced the eXplicit Multi-Threaded (XMT) design and prototyped it in hardware and software. XMT comprises a programmer's workflow that advances from work-depth, a standard PRAM theory abstraction, to an XMT program and, if desired, to performance tuning of that program. XMT provides strong performance for programs developed this way, thanks to hardware support for very fine-grained threads and for keeping the overhead of handling them low. XMT has also shown unique promise in ease of programming, the biggest problem that has limited the impact of all parallel systems to date. For example, the teachability of XMT programming has been demonstrated at various levels, from rising 6th graders to graduate students, and students in a freshman class were able to program three parallel sorting algorithms. The main purpose of the current paper is to stimulate discussion on the following somewhat open-ended question: now that we have made significant progress on a system devoted to supporting PRAM-like programming, is it possible to incorporate our hardware support as an add-on into other current and future many-core systems? The paper considers a concrete proposal for doing that: recasting our work as a hardware-enhanced programmer's workflow "module" that can then essentially be imported into other systems.