UBL: unsupervised behavior learning for predicting performance anomalies in virtualized cloud systems

Authors:
Daniel Joseph Dean;Hiep Nguyen;Xiaohui Gu
Affiliations:
North Carolina State University, Raleigh, NC, USA;North Carolina State University, Raleigh, NC, USA;North Carolina State University, Raleigh, NC, USA
Venue:
Proceedings of the 9th international conference on Autonomic computing
Year:
2012

Abstract

Infrastructure-as-a-Service (IaaS) clouds are prone to performance anomalies due to their complex nature. Although previous work has shown the effectiveness of using statistical learning to detect performance anomalies, existing schemes often assume labelled training data, which takes significant human effort to produce and limits those schemes to previously known anomalies. We present an Unsupervised Behavior Learning (UBL) system for IaaS cloud computing infrastructures. UBL leverages Self-Organizing Maps to capture emergent system behaviors and predict unknown anomalies. For scalability, UBL uses residual resources in the cloud infrastructure for behavior learning and anomaly prediction with little add-on cost. We have implemented a prototype of the UBL system on top of the Xen platform and conducted extensive experiments using a range of distributed systems. Our results show that UBL can predict performance anomalies with high accuracy and achieve sufficient lead time for automatic anomaly prevention. UBL supports large-scale infrastructure-wide behavior learning with negligible overhead.
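
To make the idea of learning normal behavior with a Self-Organizing Map more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each sample is a vector of resource metrics normalized to [0, 1] (for example CPU and memory utilization), trains a small map on unlabelled normal samples, and scores a new sample by its distance to the best matching map node. That score (quantization error) is a common choice and is an assumption here; the abstract does not say which measure UBL uses. All function names, map sizes, and decay schedules are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate of the node whose weight vector is closest to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def train_som(samples, rows=4, cols=4, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM on unlabelled samples (lists of floats in [0, 1])."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # One randomly initialized weight vector per map node.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighborhood radius shrinks
        order = samples[:]
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Gaussian neighborhood: nodes near the BMU on the grid move more.
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


def anomaly_score(weights, x):
    """Distance from x to its best matching node; larger means less familiar."""
    return math.dist(weights[best_matching_unit(weights, x)], x)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical normal behavior: moderate CPU and memory utilization.
    normal = [[rng.uniform(0.2, 0.4), rng.uniform(0.3, 0.5)] for _ in range(200)]
    som = train_som(normal)
    print("normal sample score:", anomaly_score(som, normal[0]))
    print("unusual sample score:", anomaly_score(som, [0.95, 0.95]))
```

In this sketch a sample would be flagged when its score exceeds a threshold derived from the scores of the training data; how to pick that threshold, and how to obtain lead time before an anomaly fully develops, are beyond what the abstract specifies.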