Functional algorithmic skeletons promise a high-level programming interface for distributed-memory clusters that frees developers from concerns of task decomposition, scheduling, and communication. Unfortunately, prior distributed functional skeleton frameworks do not deliver performance comparable to that achievable in a low-level distributed programming model such as C with MPI and OpenMP, even when used in concert with high-performance array libraries. There are several causes: they do not take advantage of shared memory on each cluster node; they impose a fixed partitioning strategy on input data; and they have limited ability to fuse loops involving skeletons that produce a variable number of outputs per input. We address these shortcomings in the Triolet programming language through a modular library design that separates concerns of parallelism, loop nesting, and data partitioning. We show how Triolet substantially improves the parallel performance of algorithms involving array traversals and nested, variable-size loops over what is achievable in Eden, a distributed variant of Haskell. We further demonstrate how Triolet can substantially simplify parallel programming relative to C with MPI and OpenMP while achieving 23% to 100% of its performance on a 128-core cluster.