Optimized process placement for collective I/O operations

Authors:
Vishwanath Venkatesan;Rakhi Anand;Jaspal Subhlok;Edgar Gabriel
Affiliations:
University of Houston;University of Houston;University of Houston;University of Houston
Venue:
Proceedings of the 20th European MPI Users' Group Meeting
Year:
2013

Abstract

Mapping of MPI processes to the available resources is an increasingly complex but important task on modern parallel systems. This paper presents a new approach to optimize the process placement of a parallel application based on its I/O access pattern. The paper introduces the SetMatch process mapping algorithm, which significantly reduces the cost of the communication occurring in collective I/O operations. The effectiveness of the approach has been evaluated for multiple scenarios on a PVFS2 file system. Our results demonstrate significant improvements in the communication time of collective I/O operations as well as improvements in the overall application execution time with our mapping strategy. The generalized SetMatch algorithm was the only mapping strategy that was able to provide adequate performance for all scenarios used in this paper.