Adapting distributed scientific applications to run-time network conditions

  • Authors: Masha Sosonkina
  • Affiliations: Ames Laboratory and Iowa State University, Ames, IA
  • Venue: PARA'04 Proceedings of the 7th International Conference on Applied Parallel Computing: State of the Art in Scientific Computing
  • Year: 2004

Abstract

High-performance applications place heavy demands on the computation and communication resources of distributed computing platforms. When resource availability changes dynamically, application performance may suffer, especially on clusters. It is therefore desirable to make an application aware of run-time changes in the system and to adapt it dynamically to the new conditions. We show how this may be done with a helper tool, the NICAN middleware. In our experiments, NICAN uses a packet-probing technique to detect contention on cluster nodes while a distributed iterative linear system solver from the pARMS package is executing. Adapting the solver to the detected network conditions may result in faster iterative convergence.
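
As a rough illustration of the packet-probing idea mentioned in the abstract, the sketch below times a series of probe round trips and flags contention when the median latency exceeds a baseline by some factor. It is not NICAN's interface or the paper's method: the function names, the `send_probe` callable, the baseline, and the threshold factor are all assumptions made for this example.

```python
import statistics
import time
from typing import Callable, List


def probe_latencies(send_probe: Callable[[], None], count: int = 10) -> List[float]:
    """Time `count` probe round trips, in seconds.

    `send_probe` is a hypothetical callable that sends one probe packet
    and blocks until the reply arrives.
    """
    samples = []
    for _ in range(count):
        start = time.perf_counter()
        send_probe()
        samples.append(time.perf_counter() - start)
    return samples


def contention_detected(samples: List[float], baseline: float,
                        factor: float = 2.0) -> bool:
    """Report contention when the median latency exceeds factor * baseline.

    The median is used so a single slow probe does not trigger a false alarm.
    Both `baseline` and `factor` are assumed tuning parameters.
    """
    return statistics.median(samples) > factor * baseline
```

An application could call `probe_latencies` periodically between solver iterations and change its behavior only when `contention_detected` returns true; what the solver then does in response is left open here.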