More types for nested data parallel programming
ICFP '00 Proceedings of the fifth ACM SIGPLAN international conference on Functional programming
Data parallel Haskell: a status report
Proceedings of the 2007 workshop on Declarative aspects of multicore programming
Nested data-parallelism on the gpu
Proceedings of the 17th ACM SIGPLAN international conference on Functional programming
Work efficient higher-order vectorisation
Proceedings of the 17th ACM SIGPLAN international conference on Functional programming
Data-only flattening for nested data parallelism
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Towards a streaming model for nested data parallelism
Proceedings of the 2nd ACM SIGPLAN workshop on Functional high-performance computing
Hi-index | 0.00 |
An apply-to-all construct is the key mechanism for expressing data-parallelism, but data-parallel programming languages like HPF and C* significantly restrict which operations can appear in the construct. Allowing arbitrary operations substantially simplifies the expression of irregular and nested data-parallel computations. The technique of flattening nested parallelism introduced by Blelloch, compiles data-parallel programs with unrestricted apply-to-all constructs into vector operations, and has achieved notable success, particularly with irregular data-parallel programs. However, these programs must be carefully constructed so that flattening them does not lead to suboptimal work complexity due to unnecessary replication in index operations. We present new flattening transformations that generate programs with correct work complexity. Because these transformations may introduce concurrent reads in parallel indexing, we developed a randomized indexing that reduces concurrent reads while maintaining work-efficiency. Experimental results show that the new rules and implementations significantly reduce memory usage and improve performance.