Learning useful and predictable features from past workloads, and exploiting them well, is a major source of improvement in many operating system problems. We review known features of parallel workloads and argue that the right approach for both future on-line algorithm design and workload modeling is user- and session-based modeling, rather than the direct analysis of jobs that is common today. We then give statistically sound answers to two basic questions. First, which user and session features are central enough to be potentially useful? We answer this using Principal Component Analysis. Second, which user and session classes exist, and how can they be identified on-line? We answer this using K-means clustering. We identify variable sets that explain over 80% of the variance between sessions and between users, as well as five stable session classes (clusters) and four stable user classes. Our analysis is based on logs from seven different parallel supercomputers, spanning over 87 months, which are analyzed together to ensure that the results are location- and architecture-neutral.
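To make the two techniques named above concrete, here is a minimal sketch of how Principal Component Analysis and K-means clustering might be applied to a table of per-session features. It is illustrative only and does not reproduce the procedure of the paper. The abstract does not list the features or the preprocessing used, so the three feature columns (jobs per session, mean runtime, mean processors), the synthetic data, and the function names `explained_variance_ratio` and `kmeans` are all assumptions.

```python
import numpy as np


def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component.

    Features are standardized first (an assumption, not taken from the
    abstract) so that no single unit of measurement dominates.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Eigenvalues of the covariance matrix, reordered to descending.
    eigvals = np.linalg.eigvalsh(np.cov(Z, rowvar=False))[::-1]
    return eigvals / eigvals.sum()


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points; keep it if empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers


# Hypothetical session table with three columns:
# jobs per session, mean runtime, mean processors (synthetic values).
rng = np.random.default_rng(42)
short_sessions = rng.normal(loc=[2.0, 5.0, 4.0], scale=0.5, size=(50, 3))
long_sessions = rng.normal(loc=[20.0, 60.0, 64.0], scale=0.5, size=(50, 3))
sessions = np.vstack([short_sessions, long_sessions])

ratios = explained_variance_ratio(sessions)
labels, centers = kmeans(sessions, k=2)
print("explained variance ratio per component:", ratios)
print("cluster sizes:", np.bincount(labels))
```

In a setting like the one the abstract describes, the cumulative sum of `ratios` would indicate how many components are needed to pass a threshold such as 80% of the variance, and `k` would be chosen by checking which cluster counts stay stable across logs rather than fixed in advance as it is here.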