Compiling Fortran D for MIMD distributed-memory machines
Communications of the ACM
An overview of High Performance Fortran
ACM SIGPLAN Fortran Forum
PVM: Parallel virtual machine: a users' guide and tutorial for networked parallel computing
PVM: Parallel virtual machine: a users' guide and tutorial for networked parallel computing
Co-array Fortran for parallel programming
ACM SIGPLAN Fortran Forum
Using MPI (2nd ed.): portable parallel programming with the message-passing interface
Using MPI (2nd ed.): portable parallel programming with the message-passing interface
The Case for High-Level Parallel Programming in ZPL
IEEE Computational Science & Engineering
A cellular computer to implement the kalman filter algorithm
A cellular computer to implement the kalman filter algorithm
X10: an object-oriented approach to non-uniform cluster computing
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Analyzing block locality in Morton-order and Morton-hybrid matrices
MEDEA '06 Proceedings of the 2006 workshop on MEmory performance: DEaling with Applications, systems and architectures
Analyzing block locality in Morton-order and Morton-hybrid matrices
ACM SIGARCH Computer Architecture News
Hi-index | 0.00 |
In this paper, we show our initial experience with a class of objects, called Hierarchically Tiled Arrays (HTAs), that encapsulate parallelism. HTAs allow the construction of single-threaded parallel programs where a master process distributes tasks to be executed by a collection of servers holding the components (tiles) of the HTAs. The tiled and recursive nature of HTAs facilitates the adaptation of the programs that use them to varying machine configurations, and eases the mapping of data and tasks to parallel computers with a hierarchical organization. We have implemented HTAs as a MATLAB™ toolbox, overloading conventional operators and array functions such that HTA operations appear to the programmer as extensions of MATLAB™. Our experiments show that the resulting environment is ideal for the prototyping of parallel algorithms and greatly improves the ease of development of parallel programs while providing reasonable performance.