We present a model for the computational cost of the finite-difference time-domain (FDTD) method that is independent of implementation details and of the application domain. The model is used to formalize the problem of optimally distributing the computational load over an arbitrary set of resources in a heterogeneous cluster. Our formulation of the load distribution problem accounts for computational and communication costs simultaneously. We show that the problem can be formulated as a minimax optimization problem and derive analytic lower bounds on the computational cost. The work provides insight into the optimal design of parallel FDTD software. We demonstrate that proper load distribution can yield significant performance gains, as much as 75%.
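To make the minimax idea concrete, the following is a minimal sketch under a simplified cost model that is an assumption of this example and not the model derived in the paper: resource i processes cells at rate `speeds[i]` and pays a fixed communication cost `comm_costs[i]` if it receives any work, so its finish time is `n_i / speeds[i] + comm_costs[i]`. The function name `minimax_load` and all parameters are hypothetical. The sketch solves the continuous relaxation (fractional cell counts) by equalizing the finish times of all resources worth using.

```python
def minimax_load(total_cells, speeds, comm_costs):
    """Split total_cells to minimize max_i (n_i / speeds[i] + comm_costs[i]).

    Assumed cost model (illustrative only): linear compute time plus a
    fixed per-resource communication cost, paid only when n_i > 0.
    Returns (loads, finish), where finish is the common finish time.
    """
    # Consider resources in order of increasing communication cost.
    order = sorted(range(len(speeds)), key=lambda i: comm_costs[i])
    work = float(total_cells)  # N + sum of s_i * c_i over active resources
    rate = 0.0                 # sum of s_i over active resources
    finish = float("inf")
    active = []
    for i in order:
        # A resource whose fixed cost already meets or exceeds the current
        # finish time cannot reduce the maximum, so stop adding resources.
        if comm_costs[i] >= finish:
            break
        active.append(i)
        work += speeds[i] * comm_costs[i]
        rate += speeds[i]
        finish = work / rate   # equalized finish time of active resources
    loads = [0.0] * len(speeds)
    for i in active:
        loads[i] = speeds[i] * (finish - comm_costs[i])
    return loads, finish


if __name__ == "__main__":
    loads, finish = minimax_load(100, [4.0, 1.0], [0.0, 10.0])
    print(loads, finish)
```

For 100 cells, speeds of 4 and 1, and communication costs of 0 and 10, this sketch assigns 88 and 12 cells with a common finish time of 22, whereas an even 50/50 split would finish at 60 because the slow resource dominates. A real FDTD decomposition would also need integer cell counts and a communication cost that depends on subdomain boundaries, which this example deliberately ignores.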