Improving inter-node communications in multi-core clusters using a contention-free process mapping algorithm

Authors:
Mohsen Soryani; Morteza Analoui; Ghobad Zarrinchian
Affiliations:
Iran University of Science and Technology, Tehran, Iran; Iran University of Science and Technology, Tehran, Iran; Iran University of Science and Technology, Tehran, Iran
Venue:
The Journal of Supercomputing
Year:
2013

Abstract

High-performance clusters, which are built by connecting many computing nodes together, are one of the main architectures for obtaining extremely high performance. Currently, these systems are moving from multi-core to many-core architectures to enhance their computational capabilities. This trend will eventually make network interfaces a performance bottleneck, because these interfaces are few in number and cannot handle multiple network requests at a time. The consequence is a longer waiting time in the network interface queue and lower performance. In this paper, we tackle this problem by introducing a process mapping algorithm that attempts to improve inter-node communications in multi-core clusters. Our mapping strategy reduces accesses to the network interface by distributing communication-intensive processes among computing nodes, which leads to a shorter waiting time in the network interface queue. Performance results for synthetic and real workloads show that the proposed strategy improves performance by 8% to 90% in the tested cases compared to other methods.
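The idea of spreading communication-intensive processes across nodes can be illustrated with a small greedy sketch. This is not the algorithm proposed in the paper, whose details the abstract does not give. It is a minimal illustration under simplifying assumptions: each process has a single known communication volume, all of that volume is treated as inter-node traffic, and every node has the same number of cores. The function name `spread_by_communication` and its parameters are hypothetical.

```python
def spread_by_communication(comm_volume, num_nodes, cores_per_node):
    """Greedy sketch (assumption, not the paper's method): place the heaviest
    communicators first, each on the least-loaded node that has a free core.

    comm_volume[p] is a hypothetical per-process communication volume.
    Returns (mapping, node_load): process index -> node index, and the
    total communication volume assigned to each node.
    """
    if len(comm_volume) > num_nodes * cores_per_node:
        raise ValueError("more processes than available cores")

    node_load = [0.0] * num_nodes             # volume queued at each node's interface
    free_cores = [cores_per_node] * num_nodes
    mapping = {}

    # Visit processes from most to least communication-intensive;
    # ties are broken by process index so the result is deterministic.
    order = sorted(range(len(comm_volume)), key=lambda p: (-comm_volume[p], p))

    for proc in order:
        # Only nodes that still have a free core are candidates.
        candidates = [n for n in range(num_nodes) if free_cores[n] > 0]
        # Choose the candidate whose network interface is least loaded so far.
        node = min(candidates, key=lambda n: (node_load[n], n))
        mapping[proc] = node
        node_load[node] += comm_volume[proc]
        free_cores[node] -= 1

    return mapping, node_load
```

For volumes `[100, 90, 10, 5]` on two dual-core nodes, this sketch assigns per-node totals of 105 and 100, whereas filling node 0 first would put 190 on one interface and 15 on the other. A realistic mapper would also have to account for which process pairs communicate, since traffic between processes on the same node never reaches the network interface.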