Cache size in a cost model for heterogeneous skeletons

Authors:
Khari Armih;Greg Michaelson;Phil Trinder
Affiliations:
Heriot-Watt University, Edinburgh, United Kingdom;Heriot-Watt University, Edinburgh, United Kingdom;Heriot-Watt University, Edinburgh, United Kingdom
Venue:
Proceedings of the fifth international workshop on High-level parallel programming and applications
Year:
2011

Abstract

High-performance architectures are increasingly heterogeneous, combining shared- and distributed-memory components. Programming such architectures is complicated, and performance portability is a major issue as the architectures evolve. This paper proposes a new architectural cost model that accounts for cache size and thereby improves load balance and performance on heterogeneous architectures, and demonstrates a skeleton-based programming model that simplifies programming heterogeneous architectures. We further demonstrate that skeletons can exploit the cost model to improve load balancing on heterogeneous architectures. The heterogeneous skeleton model facilitates performance portability, using the architectural cost model to automatically balance load across heterogeneous components of the architecture. For both a data-parallel benchmark and a realistic image-processing program, we obtain good performance for the heterogeneous skeleton on homogeneous shared- and distributed-memory architectures and on three heterogeneous architectures. We also show that taking cache size into account in the model leads to improved balance and performance.
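
The idea of using an architectural cost model to balance load can be illustrated with a minimal sketch. The abstract does not state the paper's formula, so the weighting below (cores times clock speed times cache size) is purely an assumption for illustration, and the names `Node`, `node_weight`, and `split_work` are hypothetical. The sketch only shows how a per-node weight that includes cache size could drive a proportional split of work items.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical description of one processing element."""
    name: str
    cores: int
    speed_ghz: float
    cache_mb: float


def node_weight(node: Node) -> float:
    # ASSUMPTION: relative power = cores * clock speed * cache size.
    # This is an illustrative stand-in, not the paper's cost model.
    return node.cores * node.speed_ghz * node.cache_mb


def split_work(total_items: int, nodes: List[Node]) -> List[int]:
    """Divide total_items among nodes in proportion to node_weight."""
    weights = [node_weight(n) for n in nodes]
    total_weight = sum(weights)
    # Start from the truncated proportional share of each node.
    shares = [int(total_items * w / total_weight) for w in weights]
    # Hand leftover items to the heaviest nodes so the shares sum exactly.
    leftover = total_items - sum(shares)
    order = sorted(range(len(nodes)), key=lambda i: weights[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


if __name__ == "__main__":
    cluster = [Node("big", 8, 2.0, 4.0), Node("small", 4, 2.0, 2.0)]
    print(split_work(100, cluster))
```

With these made-up node descriptions, the larger-cache node receives a larger share than a split based on cores and clock speed alone would give it, which is the kind of effect the abstract attributes to including cache size in the model.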