Stencils are typical building blocks of many numerical scientific applications. Several parallelization methods exist, and the choice among them depends on the given stencil, the parallel programming system, and other factors. Implementing stencils in a library simplifies application programming, makes it possible to experiment with different parallelization methods, and supports the automatic adaptation of those methods to a given stencil. This paper introduces PASTHA, a prototype of a Haskell library for describing stencil-based problems declaratively and computing them in parallel. The description is flexible enough to cover all 2D stencils we are aware of. The implementation is based on task queues and strict evaluation. We report on experiments with a Gauß-Seidel stencil, for which we achieved speedups of up to 4 on six cores, and with global and local sequence scoring from the Haskell bioinformatics library bio. For local scoring, the running time was reduced by a factor of 55, which is partially due to PASTHA.
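To make the notion of a declaratively described 2D stencil concrete, the following is a minimal sequential sketch of one Gauß-Seidel sweep. It is not PASTHA's API: the names `Stencil`, `newOffsets`, `oldOffsets`, `sweep`, and `demo` are hypothetical and chosen for illustration only. The sketch also assumes a lazy boxed array to resolve the dependency order, whereas the library described above relies on task queues and strict evaluation.

```haskell
import Data.Array

-- Hypothetical stencil description (an assumption, not PASTHA's API):
-- relative offsets of the neighbours a cell reads.
data Stencil = Stencil
  { newOffsets :: [(Int, Int)]  -- neighbours already updated in this sweep
  , oldOffsets :: [(Int, Int)]  -- neighbours taken from the previous sweep
  }

-- Gauss-Seidel: updated values from above and left, old values from below and right.
gaussSeidel :: Stencil
gaussSeidel = Stencil
  { newOffsets = [(-1, 0), (0, -1)]
  , oldOffsets = [(1, 0), (0, 1)]
  }

type Grid = Array (Int, Int) Double

-- One sweep over the grid. Boundary cells are kept fixed; every interior
-- cell becomes the average of its stencil neighbours. The result array
-- refers to itself lazily, so cells are evaluated in dependency order.
sweep :: Stencil -> Grid -> Grid
sweep st old = new
  where
    bnds@((r0, c0), (r1, c1)) = bounds old
    new = listArray bnds [cell ix | ix <- range bnds]
    n = length (newOffsets st) + length (oldOffsets st)
    shift (r, c) (dr, dc) = (r + dr, c + dc)
    cell ix@(r, c)
      | r == r0 || r == r1 || c == c0 || c == c1 = old ! ix
      | otherwise =
          ( sum [new ! shift ix d | d <- newOffsets st]
          + sum [old ! shift ix d | d <- oldOffsets st]
          ) / fromIntegral n

-- Illustrative 4x4 grid: top row held at 4, everything else 0.
demo :: Grid
demo = listArray bnds [if r == 0 then 4 else 0 | (r, _) <- range bnds]
  where bnds = ((0, 0), (3, 3))

main :: IO ()
main = print [sweep gaussSeidel demo ! ix | ix <- range ((1, 1), (2, 2))]
-- prints [1.0,1.25,0.25,0.375]
```

Because each interior cell depends only on its upper and left neighbours from the current sweep, cells on the same anti-diagonal are independent of one another. A parallel implementation could exploit this, for example by handing such cells to worker tasks; the sketch above leaves that step out.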