Towards characterizing cloud backend workloads: insights from Google compute clusters

Authors:
Asit K. Mishra;Joseph L. Hellerstein;Walfredo Cirne;Chita R. Das
Affiliations:
The Pennsylvania State University, University Park, PA;Google Inc., Mountain View, CA;Google Inc., Mountain View, CA;The Pennsylvania State University, University Park, PA
Venue:
ACM SIGMETRICS Performance Evaluation Review
Year:
2010



Abstract

The advent of cloud computing promises highly available, efficient, and flexible computing services for applications such as web search, email, voice over IP, and web search alerts. Our experience at Google is that realizing the promises of cloud computing requires an extremely scalable backend consisting of many large compute clusters that are shared by application tasks with diverse service level requirements for throughput, latency, and jitter. These considerations impact (a) capacity planning to determine which machine resources must grow and by how much and (b) task scheduling to achieve high machine utilization and to meet service level objectives. Both capacity planning and task scheduling require a good understanding of task resource consumption (e.g., CPU and memory usage). This in turn demands simple and accurate approaches to workload classification: determining how to form groups of tasks (workloads) with similar resource demands. One approach to workload classification is to make each task its own workload. However, this approach scales poorly since tens of thousands of tasks execute daily on Google compute clusters. Another approach to workload classification is to view all tasks as belonging to a single workload. Unfortunately, applying such a coarse-grain workload classification to the diversity of tasks running on Google compute clusters results in large variances in predicted resource consumption. This paper describes an approach to workload classification and its application to the Google Cloud Backend, arguably the largest cloud backend on the planet. Our methodology for workload classification consists of: (1) identifying the workload dimensions; (2) constructing task classes using an off-the-shelf algorithm such as k-means; (3) determining the break points for qualitative coordinates within the workload dimensions; and (4) merging adjacent task classes to reduce the number of workloads.
We use the foregoing, especially the notion of qualitative coordinates, to glean several insights about the Google Cloud Backend: (a) the duration of task executions is bimodal in that tasks either have a short duration or a long duration; (b) most tasks have short durations; and (c) most resources are consumed by a few tasks with long duration that have large demands for CPU and memory.
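
The four methodology steps can be illustrated with a minimal Python sketch. It is not the authors' implementation. The task tuples, the normalization to [0, 1], the break points, the qualitative names, and every function name below are hypothetical and chosen only for illustration. The sketch also simplifies two steps: the break points are fixed by hand rather than derived from the data, and task classes are merged when their centroids share the same qualitative coordinates, which is a stand-in for the merging of adjacent classes described in the abstract.

```python
from collections import defaultdict


def kmeans(points, k, iters=100):
    """Lloyd's k-means on equal-length tuples of floats; needs k >= 2."""
    ordered = sorted(points)
    # Deterministic initialization (an assumption made for reproducibility):
    # pick k points evenly spaced through the sorted input.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        members = defaultdict(list)
        for p, lab in zip(points, labels):
            members[lab].append(p)
        # Update step: each centroid becomes the mean of its members;
        # an empty class keeps its previous centroid.
        updated = [
            tuple(sum(col) / len(members[c]) for col in zip(*members[c]))
            if members[c] else centroids[c]
            for c in range(k)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def qualitative(value, breaks, names):
    """Return names[i], where i is the number of break points <= value."""
    return names[sum(value >= b for b in breaks)]


def classify(points, k, breaks, names):
    """Cluster tasks, label each class centroid qualitatively, merge equal labels."""
    centroids, labels = kmeans(points, k)
    workloads = defaultdict(list)
    for p, lab in zip(points, labels):
        coord = tuple(
            qualitative(v, b, n) for v, b, n in zip(centroids[lab], breaks, names)
        )
        workloads[coord].append(p)
    return dict(workloads)


# Step 1: hypothetical workload dimensions (duration, CPU, memory), scaled to [0, 1].
TASKS = [
    (0.02, 0.10, 0.05), (0.03, 0.05, 0.10), (0.05, 0.08, 0.07),
    (0.04, 0.12, 0.06), (0.01, 0.06, 0.09), (0.06, 0.09, 0.08),
    (0.90, 0.80, 0.85), (0.95, 0.75, 0.90),
]
# Step 3: hypothetical break points and qualitative names per dimension.
BREAKS = [(0.5,), (0.33, 0.66), (0.33, 0.66)]
NAMES = [("small", "large"), ("small", "medium", "large"),
         ("small", "medium", "large")]

# Steps 2 and 4: build 3 task classes with k-means, then merge by label.
workloads = classify(TASKS, 3, BREAKS, NAMES)
for coord, members in workloads.items():
    print(coord, len(members))
```

On this toy input, k-means splits the short tasks across two classes, and the merge step folds them back into a single workload because their centroids share the same qualitative coordinates. Swapping in a library k-means or data-driven break points would not change the overall shape of the pipeline.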