Groups in bulk synchronous parallel computing

Authors:
Gonzalez J. A.;Leon C.;Piccoli F.;Printista M.;Roda J. L.;Rodriguez C.;Sande F.
Affiliations:
Departamento de E.I.O.C., Universidad de La Laguna, Facultad de Matemáticas. Tenerife, Spain;Departamento de E.I.O.C., Universidad de La Laguna, Facultad de Matemáticas. Tenerife, Spain;Universidad N. de San Luis, Ejercito de los Andes, San Luis, Argentina;Universidad N. de San Luis, Ejercito de los Andes, San Luis, Argentina;Departamento de E.I.O.C., Universidad de La Laguna, Facultad de Matemáticas. Tenerife, Spain;Departamento de E.I.O.C., Universidad de La Laguna, Facultad de Matemáticas. Tenerife, Spain;Departamento de E.I.O.C., Universidad de La Laguna, Facultad de Matemáticas. Tenerife, Spain
Venue:
EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
Year:
2000

Abstract

An extension of the Bulk Synchronous Parallel (BSP) model that allows the use of asynchronous BSP groups of processors is presented. In this model, called Nested BSP, processor groups can be divided, and the processors in a group synchronize through group-dependent collective operations that generalize the concept of barrier synchronization. A classification of problems and algorithms according to their parallel input-output distribution is provided. For one of these problem classes, the so-called Common-Common class, we present a general strategy to derive efficient parallel algorithms. Algorithms belonging to this class allow the processor subsets to be divided arbitrarily, which makes it easier for the underlying BSP software to divide the network into independent subnetworks and minimizes the impact of traffic in the rest of the network on the predicted cost. The expressiveness of the model is illustrated through three divide-and-conquer programs. The computational results for these programs on six high-performance supercomputers show both the accuracy of the model and the optimality of the speedups for the class of problems considered.
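
The group-division idea described in the abstract can be illustrated with a minimal sketch. It is a sequential Python simulation, not the authors' implementation or any real BSP library API. It assumes that "Common-Common" means both the input and the output are available to every processor of a group, and it uses a quicksort-style partition purely as an example of a divide-and-conquer program. The names `split_group`, `dc_sort` and `trace` are hypothetical.

```python
def split_group(procs):
    """Divide a processor group into two disjoint subgroups (hypothetical helper)."""
    mid = len(procs) // 2
    return procs[:mid], procs[mid:]


def dc_sort(data, procs, trace):
    """Simulate a divide-and-conquer sort in which the processor group is
    divided along with the problem.

    Assumption: every processor in `procs` holds the whole of `data`
    (common input) and must end up holding the whole result (common output).
    `trace` records which group handled a subproblem of which size.
    """
    trace.append((tuple(procs), len(data)))

    # Base case: a single processor, or a trivial problem, is solved locally.
    if len(procs) == 1 or len(data) <= 1:
        return sorted(data)

    # Divide: partition the data around a pivot.
    pivot = data[len(data) // 2]
    low = [x for x in data if x < pivot]
    equal = [x for x in data if x == pivot]
    high = [x for x in data if x > pivot]

    # Divide the group: each subgroup works on one subproblem. In a real
    # nested model the two subgroups would proceed independently of each other.
    left, right = split_group(procs)
    sorted_low = dc_sort(low, left, trace)
    sorted_high = dc_sort(high, right, trace)

    # Combine: stands in for a group-level collective exchange after which
    # every processor in `procs` holds the full sorted result.
    return sorted_low + equal + sorted_high


trace = []
result = dc_sort([5, 3, 8, 1, 9, 2], [0, 1, 2, 3], trace)
```

After the call, `trace` lists the groups in the order they were formed, which shows how a group of four processors is divided into smaller groups as the recursion deepens. The recursion here runs one branch after the other; the point of the sketch is only the pairing of each subproblem with its own processor subgroup.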