In recent years, a large number of parallel computation models have been proposed to replace the PRAM as the parallel computation model presented to the algorithm designer. Although the theoretical justifications for these models are mostly sound, and many algorithmic results were obtained through these models, little experimentation has been conducted to validate the effectiveness of these models for developing cost-effective algorithms and applications on existing hardware platforms. In this article, a first attempt is made to give a detailed experimental account of the precision of these models. To achieve this, three models (BSP, E-BSP, and BPRAM) were selected and validated on five parallel platforms (Cray T3E, Thinking Machines CM-5, Intel Paragon, MasPar MP-1, and Parsytec GCel). The work described in this article consists of three parts. First, the predictive capabilities of the models are investigated. Unlike previous experimental work, which mostly demonstrated a close match between the measured and predicted execution times, this article shows that there are several situations in which the models do not precisely predict the actual runtime behavior of an algorithm implementation. Second, a comparison between the models is provided in order to determine the model that induces the most efficient algorithms. Lastly, the performance achieved by the model-derived algorithms is compared with the performance attained by machine-specific algorithms in order to examine the effectiveness of deriving fast algorithms through the formalisms of the models.