ClusterNet: An Object-Oriented Cluster Network
IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Implications of application usage characteristics for collective communication offload
International Journal of High Performance Computing and Networking
Hardware support for OpenMP collective operations
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Typical communication networks for parallel processing are based on sending data from one processor to one, or all, of the other processors. On such a network, even simple operations that need information from every processor require many point-to-point or broadcast communications. These aggregate operations can be as simple as a barrier synchronization or as complex as an arithmetic reduction. In this paper, we discuss a class of networks that directly implement a wide range of aggregate operations. These networks can perform an aggregate operation in a single communication operation using only simple bitwise combining logic in a trivially scalable tree configuration.
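As a rough illustration of the idea, the following Python sketch simulates in software how a tree of bitwise combining nodes could produce one aggregate result from one word per processor. It is not the hardware design discussed in the paper: the names `tree_combine` and `barrier_released` are hypothetical, and the pairwise binary tree is an assumption made for the example.

```python
from operator import and_, or_


def tree_combine(values, op):
    """Combine one word per processor pairwise, level by level.

    Each pass of the loop models one level of a binary combining tree;
    `op` is the bitwise function applied at every tree node.
    """
    if not values:
        raise ValueError("need at least one processor")
    level = list(values)
    while len(level) > 1:
        # Combine neighbouring pairs at this level of the tree.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        # An unpaired last value passes through to the next level unchanged.
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


def barrier_released(arrived_flags):
    """Model a barrier as a bitwise AND of per-processor 'arrived' bits."""
    return tree_combine([1 if f else 0 for f in arrived_flags], and_) == 1


if __name__ == "__main__":
    words = [0b1010, 0b0110, 0b0011]
    print(bin(tree_combine(words, or_)))
    print(bin(tree_combine(words, and_)))
    print(barrier_released([True, True, False, True]))
```

In this model every processor contributes its word once and reads back a single combined result, so a barrier reduces to an AND over "arrived" bits, and the depth of the tree grows only logarithmically with the number of processors.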