Implementing the multiprefix operation on parallel and vector computers
Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures (SPAA '93)
Implementation of a portable nested data-parallel language
Journal of Parallel and Distributed Computing - Special issue on data parallel algorithms and programming
A provable time and space efficient implementation of NESL
Proceedings of the first ACM SIGPLAN international conference on Functional programming
Piecewise Execution of Nested Data-Parallel Programs
Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing (LCPC '95)
Work-efficient nested data-parallelism
Proceedings of the Fifth Symposium on the Frontiers of Massively Parallel Computation (Frontiers '95)
NESL: A Nested Data-Parallel Language
Single Assignment C: efficient support for high-level array operations in a functional setting
Journal of Functional Programming
Space profiling for parallel functional programs
Proceedings of the 13th ACM SIGPLAN international conference on Functional programming
Accelerating Haskell array codes with multicore GPUs
Proceedings of the sixth workshop on Declarative aspects of multicore programming
Nested data-parallelism on the GPU
Proceedings of the 17th ACM SIGPLAN international conference on Functional programming
Work efficient higher-order vectorisation
Proceedings of the 17th ACM SIGPLAN international conference on Functional programming
CuNesl: Compiling Nested Data-Parallel Languages for SIMT Architectures
Proceedings of the 41st International Conference on Parallel Processing (ICPP '12)
The language-integrated cost semantics for nested data parallelism pioneered by NESL provides an intuitive, high-level model for predicting the performance and scalability of parallel algorithms with reasonable accuracy. However, this predictability, obtained through a uniform, parallelism-flattening execution strategy, comes at the price of potentially prohibitive space usage in the common case of computations with an excess of available parallelism, such as dense-matrix multiplication. We present a simple nested data-parallel functional language and an associated cost semantics that retains NESL's intuitive work-depth model for time complexity, but also allows highly parallel computations to be expressed in a space-efficient way: memory usage on a single (or a few) processors is of the same order as for a sequential formulation of the algorithm, and in general scales smoothly with the degree of parallelism actually realized, rather than with the potential parallelism. The refined semantics rests on a formal distinction between fully materialized "vectors" (explicitly allocated in memory all at once) and potentially ephemeral "sequences" of values, the latter being bulk-processable in a streaming fashion. This semantics is directly compatible with previously proposed piecewise execution models for nested data parallelism, but allows the expected space usage to be reasoned about directly at the source-language level. The language definition and implementation are still very much work in progress, but we present some preliminary examples and timings suggesting that the streaming model has practical potential.
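To give a concrete, if informal, flavour of the vectors-versus-sequences distinction the abstract describes, the sketch below contrasts the two evaluation disciplines in plain Python rather than in the paper's language; the function names and the matrix-vector example are illustrative assumptions, not taken from the paper.

```python
# Hypothetical illustration of materialized "vectors" versus
# streamable "sequences", using a dense matrix-vector product.

def matvec_materialized(matrix, vec):
    # "Vector" discipline: every row of intermediate products is
    # explicitly allocated in memory all at once (as lists) before
    # any reduction happens, so peak space grows with the total
    # amount of available parallelism.
    products = [[a * b for a, b in zip(row, vec)] for row in matrix]
    return [sum(row) for row in products]

def matvec_streaming(matrix, vec):
    # "Sequence" discipline: each row's products form a generator
    # that is consumed element by element by sum(), so only O(1)
    # extra space is live per row, analogous to piecewise
    # (chunked) execution of the flattened program.
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

matrix = [[1, 2], [3, 4]]
vec = [5, 6]
assert matvec_materialized(matrix, vec) == matvec_streaming(matrix, vec)
```

Both functions compute the same result; they differ only in whether the intermediate products are materialized or streamed, which is exactly the space-usage distinction the proposed cost semantics makes visible at the source level.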