The steadily increasing number of nodes in high-performance computing systems, combined with technology and power constraints, leads to sparse network topologies. Efficient mapping of application communication patterns to the network topology therefore gains importance as systems grow to petascale and beyond. Such mapping is supported in parallel programming frameworks such as MPI, but is often not well implemented. We show that the topology mapping problem is NP-complete, and we analyze and compare different practical topology mapping heuristics. We demonstrate a fast new heuristic based on graph similarity and show its utility with application communication patterns on real topologies. Our mapping strategies support heterogeneous networks and significantly reduce congestion for irregular communication patterns on torus, fat-tree, and PERCS network topologies. We also demonstrate that the benefit of topology mapping grows with the network size and show how our algorithms can be used in a practical setting to optimize communication performance. Our topology mapping strategies reduce network congestion by up to 80%, reduce average dilation by up to 50%, and improve benchmarked communication performance by 18%.
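To make the two quality metrics named above concrete, the following minimal sketch evaluates a given task-to-node mapping. It is an illustration under stated assumptions, not the heuristic or the exact metric definitions of the paper: dilation is taken here as the hop count between the nodes hosting two communicating tasks, congestion as the largest total volume crossing any single link, and routing is assumed to follow one fixed shortest path found by breadth-first search. The names `evaluate_mapping`, `bfs_parents`, `ring`, and `comm` are hypothetical.

```python
from collections import deque


def bfs_parents(adj, src):
    """Return BFS parent pointers from src (one fixed shortest path per node)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def evaluate_mapping(comm, adj, mapping):
    """Score a mapping of tasks to network nodes.

    comm:    {(task_a, task_b): volume} communication pattern (assumed format)
    adj:     {node: [neighbor, ...]} undirected network topology
    mapping: {task: node}
    Returns (average dilation in hops, maximum link congestion in volume).
    """
    if not comm:
        return 0.0, 0
    link_load = {}
    total_hops = 0
    for (a, b), volume in comm.items():
        src, dst = mapping[a], mapping[b]
        parent = bfs_parents(adj, src)
        node = dst
        # Walk the shortest path back from dst to src, charging each link.
        while parent[node] is not None:
            link = frozenset((node, parent[node]))
            link_load[link] = link_load.get(link, 0) + volume
            node = parent[node]
            total_hops += 1
    avg_dilation = total_hops / len(comm)
    max_congestion = max(link_load.values(), default=0)
    return avg_dilation, max_congestion


# Toy example: four tasks communicating in a ring, mapped onto a 4-node ring.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
comm = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1}

identity = {0: 0, 1: 1, 2: 2, 3: 3}   # neighbors stay neighbors
scrambled = {0: 0, 1: 2, 2: 1, 3: 3}  # two task pairs end up two hops apart

print(evaluate_mapping(comm, ring, identity))   # (1.0, 1)
print(evaluate_mapping(comm, ring, scrambled))  # (1.5, 2)
```

In this toy case the topology-aware mapping keeps every message on a single link, whereas the scrambled mapping raises average dilation from 1.0 to 1.5 hops and doubles the load on the busiest link. A mapping heuristic would search over `mapping` to reduce these two values; since the problem is NP-complete, exhaustive search is only feasible for very small instances like this one.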