Advances in computer technology, together with the rapid emergence of multicore processors, have made many-core personal computers widely available and affordable. Networks of workstations and clusters of many-core SMPs have consequently become an attractive platform for high-performance computing, providing computational power equal or superior to that of supercomputers or mainframes at an affordable cost using commodity components. Major research topics in this field include finding ways to extract unused and idle computing power from these resources to improve overall performance, and fully utilizing the new underlying hardware platforms. This paper introduces the design rationale and implementation of an effective toolkit for performance measurement and analysis of parallel applications in cluster environments; it not only generates a timing-graph representation of a parallel application, but also provides charts of the application's execution performance data. The goal of this toolkit is to give application developers a better understanding of an application's behavior on the computing nodes selected for a particular execution. Additionally, the results of multiple executions of an application under development can be combined and overlapped, allowing developers to perform "what-if" analysis, i.e., to gain a deeper understanding of how the allocated computational resources are utilized. Experiments with this toolkit have demonstrated its effectiveness in the development and performance tuning of parallel applications, and it has also been used in teaching message-passing and shared-memory parallel programming courses.
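The "what-if" analysis described above — combining and overlapping timing data from multiple executions — can be sketched as follows. This is a minimal illustrative example, not the toolkit's actual implementation: the trace layout (a per-rank list of named, timestamped regions) and the function names are assumptions made for exposition.

```python
# Hypothetical sketch of overlaying per-rank timing data from two runs of
# the same parallel application. The trace format {rank: [(event, start,
# end), ...]} is an assumption for illustration, not the toolkit's format.

def normalize(trace):
    """Shift a trace so that t=0 is the earliest start across all ranks."""
    t0 = min(start for events in trace.values() for _, start, _ in events)
    return {rank: [(ev, s - t0, e - t0) for ev, s, e in events]
            for rank, events in trace.items()}

def overlay(run_a, run_b):
    """Pair identically named regions per rank and report the change in
    elapsed time from run_a to run_b (negative means run_b was faster)."""
    a, b = normalize(run_a), normalize(run_b)
    report = {}
    for rank, events in a.items():
        durations_a = {ev: e - s for ev, s, e in events}
        durations_b = {ev: e - s for ev, s, e in b.get(rank, [])}
        report[rank] = {ev: durations_b.get(ev, 0.0) - d
                        for ev, d in durations_a.items()}
    return report

# Example: in the second run, rank 0's "compute" region is 0.5 s shorter.
run1 = {0: [("compute", 1.0, 3.0), ("mpi_allreduce", 3.0, 3.4)]}
run2 = {0: [("compute", 0.0, 1.5), ("mpi_allreduce", 1.5, 1.8)]}
diff = overlay(run1, run2)
```

Summarizing per-region deltas in this way is one simple means of letting a developer see which regions benefited from a change in node allocation or code, which is the kind of insight the overlapped-execution view aims to provide.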