The topology functionality of the Message Passing Interface (MPI) provides a portable, architecture-independent means for adapting application programs to the communication architecture of the target hardware. However, current MPI implementations rarely go beyond the most trivial implementation and simply perform no process remapping.

We discuss the potential of the topology mechanism for systems with a hierarchical communication architecture, such as clusters of SMP nodes. The MPI topology functionality is a weak mechanism, and we discuss some of its shortcomings. We formulate the topology optimization problem as a graph embedding problem, and show that for hierarchical systems it can be solved by graph partitioning. We state the properties of a new heuristic for solving both the embedding problem and the "easier" graph partitioning problem.

The graph partitioning based framework has been fully implemented in MPI/SX for the NEC SX-series of parallel vector computers. MPI/SX is thus one of very few MPI implementations with a non-trivial topology functionality. On a 4-node NEC SX-6, significant communication performance improvements are achieved with synthetic MPI benchmarks.
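To make the link between process remapping and graph partitioning concrete, here is a minimal Python sketch. It is not the heuristic described above and not the MPI/SX implementation. It assumes a two-level system (fast communication inside an SMP node, slow communication between nodes) and simply measures how much communication weight crosses node boundaries under a given placement. The function names, the edge weights, and the 8-process example graph are all hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

# (process_u, process_v, communication_weight); weights are illustrative.
Edge = Tuple[int, int, int]


def inter_node_weight(edges: List[Edge], node_of: List[int]) -> int:
    """Total weight of communication edges whose endpoints sit on different SMP nodes.

    On a two-level hierarchy this is the edge cut of the partition induced by
    node_of, which is the quantity a graph partitioner tries to minimise.
    """
    return sum(w for u, v, w in edges if node_of[u] != node_of[v])


def block_mapping(num_procs: int, procs_per_node: int) -> List[int]:
    """Trivial placement with no remapping: consecutive ranks fill each node in turn."""
    return [p // procs_per_node for p in range(num_procs)]


# Hypothetical communication graph for 8 processes:
# heavy traffic between process i and i + 4, light traffic around a ring.
heavy = [(i, i + 4, 10) for i in range(4)]
ring = [(i, (i + 1) % 8, 1) for i in range(8)]
edges = heavy + ring

# Placement on 2 nodes with 4 processes each, without remapping.
block = block_mapping(8, 4)

# A remapped placement that keeps every heavy pair on the same node.
remapped = [0, 0, 1, 1, 0, 0, 1, 1]

print("inter-node weight, no remapping:", inter_node_weight(edges, block))
print("inter-node weight, remapped:    ", inter_node_weight(edges, remapped))
```

In this toy case the remapped placement moves all heavy edges inside a node and leaves only light ring edges crossing between nodes. Choosing such a placement automatically, with each part sized to fit one node, is a graph partitioning problem.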