Abstract: This paper addresses the mapping of iterative algorithms onto heterogeneous clusters. The application data is partitioned among the processors, which are arranged in a virtual ring. At each iteration, independent calculations are carried out in parallel, and communication takes place between consecutive processors in the ring. The problem is to determine how to slice the application data into chunks and how to assign these chunks to the processors so that the total execution time is minimized. A major difficulty is embedding the processor ring into a network that is typically not fully connected, so that some communication links must be shared by several processor pairs. We establish a complexity result that assesses the difficulty of this problem, and we design a practical heuristic that provides efficient mapping, routing, link-sharing, and data distribution schemes.
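To make the data-slicing question concrete, the following minimal sketch splits a fixed number of data items over a ring of processors with different speeds and evaluates the resulting time per iteration. It is not the heuristic of the paper. The cost model is an assumption made for illustration: processor `i` needs `cycle_times[i]` time units per data item and `link_costs[i]` time units to send its boundary message to its successor in the ring, and link sharing is ignored. The names `proportional_chunks`, `iteration_time`, `cycle_times`, and `link_costs` are hypothetical.

```python
from fractions import Fraction
from math import floor


def proportional_chunks(total_items, cycle_times):
    """Split total_items into integer chunks proportional to processor speed.

    cycle_times[i] is the (assumed) time processor i needs per data item,
    so its speed is 1 / cycle_times[i]. Exact fractions avoid rounding drift.
    """
    speeds = [1 / Fraction(w) for w in cycle_times]
    total_speed = sum(speeds)
    ideal = [total_items * s / total_speed for s in speeds]
    chunks = [floor(x) for x in ideal]
    # Give the items lost to rounding down to the largest fractional remainders.
    leftover = total_items - sum(chunks)
    order = sorted(range(len(ideal)),
                   key=lambda i: ideal[i] - chunks[i],
                   reverse=True)
    for i in order[:leftover]:
        chunks[i] += 1
    return chunks


def iteration_time(chunks, cycle_times, link_costs):
    """Time of one iteration under the assumed model.

    Each processor computes on its chunk, then sends one message to its
    successor in the ring; the iteration ends when the slowest one finishes.
    """
    return max(n * w + c for n, w, c in zip(chunks, cycle_times, link_costs))


if __name__ == "__main__":
    cycle_times = [1, 2, 4]   # hypothetical: processor 0 is the fastest
    link_costs = [3, 3, 3]    # hypothetical per-iteration send cost on each ring edge
    chunks = proportional_chunks(70, cycle_times)
    print(chunks, iteration_time(chunks, cycle_times, link_costs))
```

With uniform link costs, a speed-proportional split equalizes the finish times of all processors. Once ring edges map to shared or unequal network links, as in the setting the abstract describes, the communication terms differ per processor and a purely proportional split is no longer guaranteed to be the best choice.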