Performance-based parallel application toolkit for high-performance clusters

  • Authors:
  • Kuan-Ching Li;Tien-Hsiung Weng

  • Affiliations:
  • Dept. of Computer Science and Information Engineering, Providence University, Taichung, Taiwan;Dept. of Computer Science and Information Engineering, Providence University, Taichung, Taiwan

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2009

Abstract

Advances in computer technology, together with the rapid emergence of multicore processors, have made many-core personal computers widely available and more affordable. Networks of workstations and clusters of many-core SMPs have become an attractive solution for high-performance computing, since they provide computational power equal or superior to that of supercomputers or mainframes at an affordable cost using commodity components. Finding alternative ways to extract unused and idle computing power from these resources to improve overall performance, and fully utilizing the underlying new hardware platforms, are major topics in this field of research. This paper introduces the design rationale and implementation of an effective toolkit for performance measurement and analysis of parallel applications in cluster environments. The toolkit not only generates a timing graph representation of a parallel application, but also provides performance data charts of the application's execution. The goal in developing this toolkit is to give application developers a better understanding of the application's behavior across the computing nodes selected for a particular execution. Additionally, the results of multiple executions of a given application under development can be combined and overlaid, permitting application developers to perform "what-if" analysis, i.e., to gain a deeper understanding of how the allocated computational resources are utilized. Experiments using this toolkit have shown its effectiveness in the development and performance tuning of parallel applications, and its use has been extended to teaching parallel programming courses on the message-passing and shared-memory models.