Low-pain, high-gain multicore programming in Haskell: coordinating irregular symbolic computations on multicore architectures

  • Authors:
  • Abdallah Deeb I. Al Zain;Kevin Hammond;Jost Berthold;Phil Trinder;Greg Michaelson;Mustafa Aswad

  • Affiliations:
Heriot-Watt University, Edinburgh, United Kingdom;University of St Andrews, St Andrews, United Kingdom;Philipps-Universität, Marburg, Germany;Heriot-Watt University, Edinburgh, United Kingdom;Heriot-Watt University, Edinburgh, United Kingdom;Heriot-Watt University, Edinburgh, United Kingdom

  • Venue:
  • Proceedings of the 4th workshop on Declarative aspects of multicore programming
  • Year:
  • 2009

Abstract

With the emergence of commodity multicore architectures, exploiting tightly-coupled parallelism has become increasingly important. Functional programming languages such as Haskell are, in principle, well placed to take advantage of this trend, offering the ability to easily identify large amounts of fine-grained parallelism. Unfortunately, real performance benefits have often proved hard to realise in practice. This paper reports on a new approach using middleware constructed with the Eden parallel dialect of Haskell. Our approach is "low pain" in the sense that the programmer constructs a parallel program by inserting a small number of higher-order algorithmic skeletons at key points in the program. It is "high gain" in the sense that we obtain good parallel speedups. Our approach is unusual in that we do not attempt to use shared memory directly, but rather coordinate parallel computations using a message-passing implementation. This has a number of advantages. First, coordination, i.e. locking and communication, is confined to limited shared-memory areas, essentially the communication buffers, and is isolated within well-understood libraries. Second, the coarse thread granularity that we obtain reduces coordination overheads: locks are normally needed only on (relatively large) messages, not on individual data items, as is often the case for simple shared-memory implementations. Finally, cache-coherency requirements are reduced, since individual tasks do not share caches and can garbage-collect independently.

We report results for two representative computational algebra problems. Computational algebra is a challenging application area that has not been widely studied in the general parallelism community. Its applications have high computational demands and are, in principle, often suitable for parallel execution, but they usually display a high degree of irregularity in terms of both task and data structure, which makes it difficult to construct parallel applications that perform well in practice. Using our system, we obtain both extremely good processor utilisation (97%) and very good absolute speedups (up to 7.7) on an eight-core machine.
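To make the "low pain" claim concrete, here is a minimal sketch of the skeleton-insertion style the abstract describes. It uses GHC's standard parallel package (Control.Parallel.Strategies) rather than the Eden middleware the paper builds on; Eden's process-based parMap skeleton is applied in essentially the same way. The sumOfDivisors workload and task list are illustrative stand-ins, not from the paper, chosen to mimic the irregular per-task cost of computational-algebra problems.

```haskell
-- Sketch only: GHC's 'parallel' package stands in for the Eden middleware.
import Control.Parallel.Strategies (parMap, rdeepseq)

-- An artificially irregular task: cost grows with n, so different list
-- elements take different amounts of work, loosely mimicking the
-- irregular computational-algebra tasks described in the abstract
-- (illustrative stand-in, not taken from the paper).
sumOfDivisors :: Integer -> Integer
sumOfDivisors n = sum [d | d <- [1 .. n], n `mod` d == 0]

main :: IO ()
main = do
  let tasks = [50000 .. 50063] :: [Integer]
  -- Sequential version would be:  map sumOfDivisors tasks
  -- Parallel version: insert one higher-order skeleton at the key point.
  print (sum (parMap rdeepseq sumOfDivisors tasks))
```

Compiled with ghc -threaded and run with +RTS -N8, each list element becomes a coarse-grained parallel task. In Eden, the same call shape instead creates independent processes that exchange results by message passing, which is what confines locking to the communication buffers as the abstract describes.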