Capacity estimation in HPC systems: simulation approach

Authors:
A. Anghelescu;R. B. Lenin;S. Ramaswamy;K. Yoshigoe
Affiliations:
Department of Mathematics and Computer Science, Emory University, Atlanta, GA;Department of Mathematics, University of Central Arkansas, Conway, AR;Industrial Software Systems, ABB Corporate Research, Bangalore, India;Department of Computer Science, University of Arkansas at Little Rock, Little Rock, AR
Venue:
ICDCIT'11 Proceedings of the 7th international conference on Distributed computing and internet technology
Year:
2011

Abstract

As HPC (high performance computing) systems are extensively employed for heavy computational problems throughout heterogeneous environments, the scale and complexity of applications raise the issue of capacity planning. A cardinal aspect of efficiency in any HPC system is the job scheduler. Job scheduling techniques can worsen or mitigate issues such as job starvation, increased queue time, and decreased system utilization. Since the impact of scheduling techniques depends on the workload of a supercomputer, this research proposes to analyze various scheduling disciplines on a given workload. By simulating an HPC system, for any given workload, we can find the paradigm that yields the best performance, i.e., the one that minimizes the wait time of jobs in the queue while maximizing resource utilization. Furthermore, given a fixed configuration of an HPC system, this research can be used to determine an appropriate workload that optimizes the system's performance. A simulation framework of this complexity has not yet been developed and implemented in the HPC literature. The efficiency of the proposed simulation framework is illustrated through simulation results for performance measures such as average queuing time, average number of jobs in the queue, and system utilization. These results are verified by a mathematical model developed for job load characterization.
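
The idea of comparing scheduling disciplines on a fixed workload can be illustrated with a toy simulation. The following is a minimal sketch and not the authors' framework: the `Job` fields, the `simulate` function, the two policies (first-come-first-served and shortest-job-first), and the three-job workload are all hypothetical and chosen only for illustration. It assumes non-preemptive scheduling without backfilling on a homogeneous pool of nodes, and it reports the three measures named in the abstract.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical job record: arrival time, run time, nodes requested."""
    arrival: float
    runtime: float
    nodes: int


def simulate(jobs, total_nodes, policy):
    """Event-driven, non-preemptive simulation without backfilling.

    `policy` is a sort key applied to the waiting queue; only the job at
    the head of the sorted queue may start, and only if enough nodes are free.
    """
    if not jobs:
        raise ValueError("workload is empty")
    if any(j.nodes > total_nodes for j in jobs):
        raise ValueError("a job requests more nodes than the system has")

    pending = sorted(jobs, key=lambda j: j.arrival)
    queue, running = [], []          # running: min-heap of (finish_time, nodes)
    free, now, makespan = total_nodes, 0.0, 0.0
    waits, i = [], 0

    while i < len(pending) or queue or running:
        # Admit every job that has arrived by the current time.
        while i < len(pending) and pending[i].arrival <= now:
            queue.append(pending[i])
            i += 1
        # Start jobs in policy order while the head of the queue fits.
        queue.sort(key=policy)
        while queue and queue[0].nodes <= free:
            job = queue.pop(0)
            free -= job.nodes
            waits.append(now - job.arrival)
            heapq.heappush(running, (now + job.runtime, job.nodes))
            makespan = max(makespan, now + job.runtime)
        # Advance the clock to the next arrival or completion.
        next_times = []
        if i < len(pending):
            next_times.append(pending[i].arrival)
        if running:
            next_times.append(running[0][0])
        if not next_times:
            break
        now = min(next_times)
        # Release the nodes of every job that has finished by now.
        while running and running[0][0] <= now:
            _, nodes = heapq.heappop(running)
            free += nodes

    total_wait = sum(waits)
    busy = sum(j.runtime * j.nodes for j in jobs)
    return {
        "avg_wait": total_wait / len(waits),
        # Time-averaged queue length over [0, makespan].
        "avg_queue_len": total_wait / makespan,
        "utilization": busy / (total_nodes * makespan),
    }


def fcfs(job):
    return job.arrival


def sjf(job):
    return (job.runtime, job.arrival)


# Hypothetical workload: three jobs that each need the whole 4-node system.
WORKLOAD = [Job(0, 10, 4), Job(1, 8, 4), Job(2, 1, 4)]

if __name__ == "__main__":
    for name, policy in (("FCFS", fcfs), ("SJF", sjf)):
        print(name, simulate(WORKLOAD, total_nodes=4, policy=policy))
```

On this toy workload both policies keep the system fully busy, but shortest-job-first yields a lower average wait because the one-unit job no longer queues behind the eight-unit job. A realistic study would replace `WORKLOAD` with a recorded or synthetic trace and add further disciplines.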