A bridging model for parallel computation. Communications of the ACM.
Direct bulk-synchronous parallel algorithms. Journal of Parallel and Distributed Computing.
A bridging model for parallel computation, communication, and I/O. ACM Computing Surveys (CSUR), special issue: position statements on strategic directions in computing research.
Efficient external memory algorithms by simulating coarse-grained parallel algorithms. Proceedings of the Ninth Annual ACM Symposium on Parallel Algorithms and Architectures.
The Paderborn University BSP (PUB) Library: Design, Implementation and Performance. Proceedings of the 13th International Symposium on Parallel Processing and the 10th Symposium on Parallel and Distributed Processing (IPPS/SPDP '99).
PRO: A Model for Parallel Resource-Optimal Computation. Proceedings of the 16th Annual International Symposium on High Performance Computing Systems and Applications (HPCS '02).
Portable list ranking: an experimental study. Journal of Experimental Algorithmics (JEA).
A computational study of external-memory BFS algorithms. Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '06).
Efficient sampling of random permutations. Journal of Discrete Algorithms.
Experiments with a parallel external memory system. Proceedings of the 14th International Conference on High Performance Computing (HiPC '07).
Bounded arboricity to determine the local structure of sparse graphs. Proceedings of the 32nd International Conference on Graph-Theoretic Concepts in Computer Science (WG '06).
We present an extension to SSCRAP, our C++ environment for the development of coarse-grained algorithms, that allows programs to be executed easily in an external memory setting. The environment is well suited to regular as well as irregular problems and scales from low-end PCs to high-end clusters and mainframes. It allows algorithms designed at a high level of abstraction in one of the known coarse-grained parallel models to run, without modification, in an external memory setting. The first tests presented in this paper show very efficient behavior for out-of-core computation (mapping memory to disk files), and even a marginal speed-up when the extension is used to reduce cache misses for in-core computation.