A parallel SML compiler based on algorithmic skeletons

  • Authors:
  • Norman Scaife; Susumu Horiguchi; Greg Michaelson; Paul Bristow

  • Affiliations:
  • School of Information Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Tatsunokuchi, Nomigun, Ishikawa 923-1292, Japan (e-mail: norman@jaist.ac.jp, hori@jaist.ac.jp); Department of Computing and Electrical Engineering, Heriot-Watt University, Riccarton, Edinburgh, EH14 4AS, United Kingdom (e-mail: greg@macs.hw.ac.uk, paul@macs.hw.ac.uk)

  • Venue:
  • Journal of Functional Programming
  • Year:
  • 2005

Abstract

Algorithmic skeletons are abstractions of common patterns of parallel activity which offer a high degree of reusability to developers of parallel algorithms. Their close association with higher order functions (HOFs) makes functional languages, with their strong transformational properties, excellent vehicles for skeleton-based parallel program development. However, using HOFs in this way raises substantial problems of identifying useful HOFs within a given application and of resource allocation on target architectures. We present the design and implementation of a parallelising compiler for Standard ML which exploits parallelism in the familiar map and fold HOFs through skeletons for processor farms and processor trees, respectively. The compiler extracts parallelism automatically and is target architecture independent. HOF execution within a functional language can be nested, in the sense that one HOF may be passed to and evaluated during the execution of another HOF. We are able to exploit this by nesting our parallel skeletons in a processor topology which matches the structure of the Standard ML source. However, where HOF arguments result from partially applied functions, free variable bindings must be identified and communicated through the corresponding skeleton hierarchy to where those arguments are actually applied. We describe the analysis leading from input Standard ML through HOF instantiation and backend compilation to an executable parallel program. We also present an overview of the runtime system and the execution model. Finally, we give parallel performance figures for several example programs, of varying computational loads, on the Linux-based Beowulf, IBM SP/2, Fujitsu AP3000 and Sun StarCat 15000 MIMD parallel machines. These demonstrate good cross-platform consistency of parallel code behaviour.
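To make the abstract's scenario concrete, the following is a minimal Standard ML sketch (not taken from the paper; the function names sumScaledRow, sumAllRows and the free variable scale are illustrative) of the kind of source the compiler would target: an outer map over an inner fold, where the fold's function argument is a partial application capturing a free variable that would have to be communicated through the skeleton hierarchy.

    (* Hypothetical worker: sum a row after scaling each element.
       The fold here is the kind of HOF the compiler would map onto
       a processor-tree skeleton. *)
    fun sumScaledRow scale row =
        foldr (fn (x, acc) => acc + scale * x) 0 row

    (* Hypothetical driver: the outer map would be instantiated as a
       processor farm; each farm worker evaluates the nested fold,
       giving a farm-of-trees topology matching the source structure.
       Note that (sumScaledRow scale) is a partial application, so the
       binding of scale must reach the workers that apply it. *)
    fun sumAllRows scale rows =
        map (sumScaledRow scale) rows

    (* Toy usage: sumAllRows 2 [[1, 2, 3], [4, 5, 6]] evaluates to [12, 30]. *)
    val result = sumAllRows 2 [[1, 2, 3], [4, 5, 6]]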