Actors: a model of concurrent computation in distributed systems
Actors: a model of concurrent computation in distributed systems
Mul-T: a high-performance parallel Lisp
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Algorithmic skeletons: structured management of parallel computation
Algorithmic skeletons: structured management of parallel computation
Concurrent programming in ERLANG (2nd ed.)
Concurrent programming in ERLANG (2nd ed.)
Using fine-grain threads and run-time decision making in parallel computing
Journal of Parallel and Distributed Computing - Special issue on multithreading for multiprocessors
The implementation of the Cilk-5 multithreaded language
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Proceedings of the ACM 2000 conference on Java Grande
Functional Programming and Parallel Graph Rewriting
Functional Programming and Parallel Graph Rewriting
Lazy Task Creation: A Technique for Increasing the Granularity of Parallel Programs
IEEE Transactions on Parallel and Distributed Systems
Parallel and Distributed Haskells
Journal of Functional Programming
Algorithm + strategy = parallelism
Journal of Functional Programming
Parallel functional programming in Eden
Journal of Functional Programming
X10: an object-oriented approach to non-uniform cluster computing
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Parallel Programmability and the Chapel Language
International Journal of High Performance Computing Applications
MapReduce: simplified data processing on large clusters
Communications of the ACM - 50th anniversary issue: 1958 - 2008
Solving Large, Irregular Graph Problems Using Adaptive Work-Stealing
ICPP '08 Proceedings of the 2008 37th International Conference on Parallel Processing
Intel threading building blocks
Intel threading building blocks
Scala Actors: Unifying thread-based and event-based programming
Theoretical Computer Science
Runtime support for multicore Haskell
Proceedings of the 14th ACM SIGPLAN international conference on Functional programming
Proceedings of the 14th ACM SIGPLAN international conference on Functional programming
The design of a task parallel library
Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
Efficient parallel programming in Poly/ML and Isabelle/ML
Proceedings of the 5th ACM SIGPLAN workshop on Declarative aspects of multicore programming
Seq no more: better strategies for parallel Haskell
Proceedings of the third ACM Haskell symposium on Haskell
Parallel Programming with Microsoft .NET: Design Patterns for Decomposition and Coordination on Multicore Architectures
A generic parallel collection framework
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part II
Da capo con scala: design and analysis of a scala benchmark suite for the java virtual machine
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
Hi-index | 0.00 |
This paper provides a performance and programmability comparison of high-level parallel programming support in Haskell, F# and Scala. Developing several parallel versions, we employ skeleton-based, semi-explicit and explicit approaches to parallelism. We focus on advanced language features for separating computational and coordination aspects of the code and tuning performance. We also assess the impact of functional purity and multi-paradigm design of the languages on program development and performance. Basis for these comparisons are several Barnes-Hut implementations of the n-body problem in all three languages, on both Linux and Windows. Our performance measurements on state-of-the-art multi-cores achieve a speedup up to 5.62 (on 8 cores) with a highly-tuned Haskell version. For comparable implementations in Scala and F# we achieve speedups of 4.51 (on 8 cores) and 2.28 (on 4 cores), respectively. We observe that near best speedups are achieved using the highest level abstraction in these languages.