ScaLAPACK user's guide
A Proposal for a Heterogeneous Cluster ScaLAPACK (Dense Linear Solvers)
IEEE Transactions on Computers
MOSFET Modeling and Bsim3 User's Guide
HPCN Europe '99 Proceedings of the 7th International Conference on High-Performance Computing and Networking
Although heterogeneous clusters are flexible and cost-effective, they are intrinsically difficult to optimize. Running multiple processes on fast processing elements (PEs) is a simple way to alleviate load imbalance, but the optimal process allocation is not obvious. Communication time is another problem. Excluding slow PEs sometimes avoids performance degradation, but the optimal PE configuration is generally difficult to find. In this study, the execution time is first modeled from measurements of various configurations. The derived models are then used to estimate the optimal PE configuration and process allocation. We implemented various models for HPL (the High-Performance Linpack benchmark) on a heterogeneous cluster and estimated the optimal configurations for various problem sizes. On a heterogeneous cluster of Athlon and Pentium-II processors, the execution time of the estimated optimal configuration was 0-7.4% longer than that of the actual optimal configuration. On a heterogeneous cluster with three kinds of processors, including dual processors, the excess time was 13.6-31.5%.
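The two-step approach described above (fit an execution-time model per configuration, then pick the configuration with the smallest predicted time) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the model form T(N) = a*N^3 + b*N^2, the configuration names `fast_only` and `fast_plus_slow`, and the timing samples, which are generated synthetically from made-up coefficients rather than measured.

```python
def fit_model(samples):
    """Least-squares fit of T(N) = a*N**3 + b*N**2 to (N, seconds) pairs.

    The model form is an assumption for this sketch: a cubic term for
    computation and a quadratic term for communication.
    """
    # Sums needed by the 2x2 normal equations of linear least squares.
    s66 = sum(n ** 6 for n, _ in samples)
    s55 = sum(n ** 5 for n, _ in samples)
    s44 = sum(n ** 4 for n, _ in samples)
    r3 = sum(t * n ** 3 for n, t in samples)
    r2 = sum(t * n ** 2 for n, t in samples)
    det = s66 * s44 - s55 * s55
    # Cramer's rule for [s66 s55; s55 s44] [a; b] = [r3; r2].
    a = (r3 * s44 - r2 * s55) / det
    b = (s66 * r2 - s55 * r3) / det
    return a, b


def predict(model, n):
    """Predicted execution time for problem size n under a fitted model."""
    a, b = model
    return a * n ** 3 + b * n ** 2


def best_configuration(models, n):
    """Return the configuration name with the smallest predicted time."""
    return min(models, key=lambda cfg: predict(models[cfg], n))


# Hypothetical coefficients used only to synthesize example timings.
# N is expressed in thousands to keep the sums well scaled.
TRUE_COEFFS = {
    "fast_only": (2.0, 0.1),       # fewer PEs: more compute, less communication
    "fast_plus_slow": (1.5, 1.0),  # more PEs: less compute, more communication
}
SIZES = [1.0, 2.0, 3.0, 4.0]

# Synthetic "measurements" (not real data), one list per configuration.
measurements = {
    cfg: [(n, a * n ** 3 + b * n ** 2) for n in SIZES]
    for cfg, (a, b) in TRUE_COEFFS.items()
}

models = {cfg: fit_model(samples) for cfg, samples in measurements.items()}

for n in (1.0, 4.0):
    print(n, best_configuration(models, n))
```

With these made-up coefficients, the predicted best configuration changes with problem size: excluding the slow PEs wins for the small problem, while including them wins for the large one. This mirrors the trade-off the abstract describes, but the numbers carry no empirical meaning.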