Design of efficient Java message-passing collectives on multi-core clusters

  • Authors:
  • Guillermo L. Taboada; Sabela Ramos; Juan Touriño; Ramón Doallo

  • Affiliations:
  • Computer Architecture Group, Dept. of Electronics and Systems, University of A Coruña, A Coruña, Spain (all authors)

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2011

Abstract

This paper presents a scalable and efficient Message-Passing in Java (MPJ) collective communication library for parallel computing on multi-core architectures. The continuous increase in the number of cores per processor underscores the need for scalable parallel solutions. Moreover, current system deployments are usually multi-core clusters, a hybrid shared/distributed memory architecture that increases the complexity of communication protocols. Here, Java represents an attractive choice for the development of communication middleware for these systems, as it provides built-in networking and multithreading support. As the performance gap between Java and natively compiled languages has been narrowing in recent years, Java is an emerging option for High Performance Computing (HPC).

Our MPJ collective communication library increases the performance of Java HPC applications on multi-core clusters by: (1) providing multi-core aware collective primitives; (2) implementing several algorithms (up to six) per collective operation, whereas publicly available MPJ libraries are usually restricted to one algorithm; (3) analyzing the efficiency of thread-based collective operations; (4) selecting at runtime the most efficient algorithm depending on the specific multi-core system architecture, the number of cores, and the message length involved in the collective operation; (5) supporting the automatic performance tuning of the collectives depending on the system and communication parameters; and (6) allowing its integration in any MPJ implementation, as it is based on MPJ point-to-point primitives. A performance evaluation on a multi-core cluster with InfiniBand and Gigabit Ethernet networks has shown that the implemented collectives significantly outperform the original ones, and that their use yields higher speedups in Java HPC applications that make intensive use of collective communications.
Finally, the presented library has been successfully integrated into MPJ Express (http://mpj-express.org), and will be distributed with the next release.
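
To make points (4) and (6) of the abstract more concrete, the following is a minimal Java sketch, not the library's actual code or API. It assumes three well-known broadcast algorithms as candidates, picks one from the number of processes and the message length using made-up thresholds, and derives the point-to-point send schedule of a binomial-tree broadcast. The class name `BcastSketch`, the enum values, the method names, and both threshold constants are hypothetical and chosen only for illustration; a tuned library would obtain its thresholds from measurements on the target system.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; all names and thresholds here are assumptions. */
final class BcastSketch {

    /** Candidate broadcast algorithms (an illustrative subset). */
    enum Algorithm { FLAT_TREE, BINOMIAL_TREE, SCATTER_ALLGATHER }

    // Assumed thresholds; automatic tuning would measure these per system.
    static final int SMALL_GROUP = 4;
    static final int LARGE_MESSAGE_BYTES = 64 * 1024;

    /** Selects an algorithm at runtime from process count and message size. */
    static Algorithm select(int processes, int messageBytes) {
        if (processes <= SMALL_GROUP) {
            return Algorithm.FLAT_TREE;          // root sends to everyone directly
        }
        if (messageBytes >= LARGE_MESSAGE_BYTES) {
            return Algorithm.SCATTER_ALLGATHER;  // split large payloads across ranks
        }
        return Algorithm.BINOMIAL_TREE;          // about log2(p) rounds for short messages
    }

    /**
     * Returns the point-to-point sends of a binomial-tree broadcast, in round
     * order. Each entry is {source rank, destination rank}. In the round with
     * distance `mask`, every relative rank below `mask` already holds the data
     * and forwards it to the relative rank `mask` positions higher.
     */
    static List<int[]> binomialSchedule(int processes, int root) {
        List<int[]> sends = new ArrayList<>();
        for (int mask = 1; mask < processes; mask <<= 1) {
            for (int rel = 0; rel < mask; rel++) {
                int peer = rel + mask;
                if (peer < processes) {
                    // Shift relative ranks so that `root` acts as relative rank 0.
                    sends.add(new int[] {
                        (rel + root) % processes,
                        (peer + root) % processes
                    });
                }
            }
        }
        return sends;
    }
}
```

In an actual MPJ program, each `{source, destination}` pair in the schedule would correspond to one point-to-point send on the source rank and a matching receive on the destination rank, which is why a collective written this way can sit on top of any implementation that offers point-to-point primitives.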