Several numerical algorithms exhibit data dependences that make the computation advance as a wavefront. Depending on the data distribution chosen, pipelining communication and computation can be the only way to avoid sequential execution of the parallel code. The computation grain must be chosen carefully to obtain both maximum parallelism and low communication overhead. In addition, when the data size exceeds the memory capacity of the target platform, the data must be stored on disk. Out-of-core computation aims to minimize the impact of the I/O needed to compute on such data, and it has been applied successfully to several linear algebra applications. In this paper we apply out-of-core techniques to wavefront algorithms. The originality of our approach is that it overlaps computation, communication, and I/O. We propose a strategy that uses several memory blocks accessed cyclically. The resulting pipelined algorithm saturates the disk, which is the bottleneck resource in out-of-core algorithms.
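The idea of cycling through a small set of memory blocks so that reading, computing, and writing overlap can be illustrated with a minimal sketch. It is not the algorithm of the paper: the function name `pipeline`, the parameters `num_buffers` and `compute`, and the use of in-memory lists in place of disk blocks are all assumptions made for illustration, and the communication stage between neighbouring processors is omitted.

```python
import queue
import threading


def pipeline(blocks, num_buffers=3, compute=lambda b: [x + 1 for x in b]):
    """Push `blocks` through read -> compute -> write using a cyclic buffer pool.

    Hypothetical sketch: `blocks` stands in for data on disk and the returned
    list stands in for the blocks written back to disk.
    """
    buffers = [None] * num_buffers           # memory blocks, reused cyclically
    free = threading.Semaphore(num_buffers)  # number of slots the reader may fill
    loaded = queue.Queue()                   # slots that are ready for compute
    done = queue.Queue()                     # slots that are ready for write-back
    results = []                             # simulated disk output

    def reader():
        for i, block in enumerate(blocks):
            free.acquire()                   # wait until the next slot is recycled
            slot = i % num_buffers           # cyclic access to the memory blocks
            buffers[slot] = list(block)      # simulated disk read
            loaded.put(slot)
        loaded.put(None)                     # end-of-stream marker

    def writer():
        while (slot := done.get()) is not None:
            results.append(buffers[slot])    # simulated disk write
            free.release()                   # hand the slot back to the reader

    threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
    for t in threads:
        t.start()

    # The calling thread is the compute stage; it runs while the reader
    # prefetches later blocks and the writer flushes earlier ones.
    while (slot := loaded.get()) is not None:
        buffers[slot] = compute(buffers[slot])
        done.put(slot)
    done.put(None)

    for t in threads:
        t.join()
    return results
```

Because every stage handles blocks in the same order, the slot released by the writer is always the one the reader needs next, so a counting semaphore is enough to protect the cyclic pool. With more buffers the reader can run further ahead of the compute stage, at the cost of more memory.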