Parallel graph reducers such as GUM use dynamic techniques to manage resources during execution. One important aspect of the dynamic behaviour is the distribution of work. The load balancing mechanism, which controls this aspect, should be flexible, to adjust the distribution of work to hardware characteristics as well as dynamic program characteristics, and scalable, to achieve high utilisation of all processors even on massively parallel machines.

In this paper we study the behaviour of GUM's load balancing mechanism on a high-latency Beowulf multi-processor. We present modifications to the basic load balancing mechanism and discuss runtime measurements, which indicate that these modifications can significantly enhance the scalability of the system.