Run time variability of parallel applications continues to present significant challenges to their performance and energy efficiency in high-performance computing (HPC) systems. When run times are extended and unpredictable, application developers perceive this as a degradation of system (or subsystem) performance. Extended run times contribute directly to proportionally higher energy consumption, potentially negating efforts by applications, or by the HPC system, to optimize energy consumption using low-level control techniques such as dynamic voltage and frequency scaling (DVFS). Successful systemic management of application run time performance can therefore result in less wasted energy, or even in energy savings.

We have been studying run time variability in terms of communication time, from the perspective of the application, focusing on the interconnection network. More recently, our focus has shifted to developing a more complete understanding of the effects of HPC subsystem interactions on parallel applications. In this context, the set of applications executing on the HPC system is treated as a subsystem, alongside more traditional subsystems such as the communication subsystem and the storage subsystem.

To gain insight into the run time variability problem, our earlier work developed PACE, a framework that emulates parallel applications and stresses the communication subsystem. PARSE, a tool that uses PACE, evaluates the run time sensitivity of real applications to network performance. In this paper, we propose a model that defines application-level behavioral attributes, which collectively describe how applications behave in terms of their run time performance as functions of their process distribution on the system (spatial locality) and of subsystem interactions (communication subsystem degradation). These subsystem interactions are produced when multiple applications execute concurrently on the same HPC system.

We also revisit our evaluation framework and tools to demonstrate the flexibility of our application characterization techniques and the ease with which attributes can be quantified. The validity of the model is demonstrated using our tools with several parallel benchmarks and application fragments. Results suggest that it is possible to articulate application-level behavioral attributes as a tuple of numeric values that describe coarse-grained performance behavior.
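To make the idea of a numeric attribute tuple concrete, the following is a minimal sketch and not the model proposed in the paper. It assumes, purely for illustration, that each attribute is the least-squares slope of normalized run time against one controlled factor, plus a coefficient of variation over repeated runs. The attribute names, the `characterize` function, and the sample measurements are all hypothetical.

```python
from statistics import mean, pstdev
from typing import NamedTuple, Sequence


class BehavioralAttributes(NamedTuple):
    """Hypothetical attribute tuple; field names are illustrative only."""
    locality_sensitivity: float      # slope of normalized run time vs. process spread
    degradation_sensitivity: float   # slope of normalized run time vs. network degradation
    variability: float               # coefficient of variation of repeated baseline runs


def slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Ordinary least-squares slope of ys against xs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def characterize(
    spread_levels: Sequence[float],
    spread_times: Sequence[float],
    degradation_levels: Sequence[float],
    degradation_times: Sequence[float],
    baseline_times: Sequence[float],
) -> BehavioralAttributes:
    """Reduce measured run times (seconds) to a coarse-grained attribute tuple.

    Run times are normalized by the mean baseline run time so that the
    slopes are dimensionless and comparable across applications.
    """
    baseline = mean(baseline_times)
    norm_spread = [t / baseline for t in spread_times]
    norm_degraded = [t / baseline for t in degradation_times]
    return BehavioralAttributes(
        locality_sensitivity=slope(spread_levels, norm_spread),
        degradation_sensitivity=slope(degradation_levels, norm_degraded),
        variability=pstdev(baseline_times) / baseline,
    )


# Made-up measurements, for illustration only.
attrs = characterize(
    spread_levels=[1, 2, 4],            # e.g. number of switches spanned (assumed unit)
    spread_times=[10.0, 12.0, 16.0],
    degradation_levels=[0.0, 0.5],      # e.g. fraction of bandwidth removed (assumed unit)
    degradation_times=[10.0, 15.0],
    baseline_times=[10.0, 10.0, 10.0],
)
print(attrs)
```

Under these assumptions, a larger slope indicates an application whose run time is more sensitive to the corresponding factor, and two applications can be compared by comparing their tuples component by component.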