HiLighter: automatically building robust signatures of performance behavior for small- and large-scale systems

Authors:
Peter Bodík; Moises Goldszmidt; Armando Fox
Affiliations:
RAD Lab, EECS Department, UC Berkeley; Microsoft Research, Silicon Valley; RAD Lab, EECS Department, UC Berkeley
Venue:
SysML'08 Proceedings of the Third conference on Tackling computer systems problems with machine learning techniques
Year:
2008

Abstract

Previous work showed that statistical analysis techniques could successfully be used to construct compact signatures of distinct operational problems in Internet server systems. Because signatures are amenable to well-known similarity search techniques, they can be used as a way to index past problems and identify particular operational problems as new or recurrent. In this paper we use a different statistical technique for constructing signatures (logistic regression with L1 regularization) that improves on previous work in two ways. First, our new approach works for cases where the number of features is an order of magnitude larger than the number of samples and also scales to problems with over 50,000 samples. Second, we get encouraging results regarding the stability of the models and the signatures by cross-validating the accuracy of the models from one section of the data center on another section. We validate our approach on data from an Internet service testbed and also from a production enterprise system comprising hundreds of servers in several data centers.
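
To make the core idea concrete, the following is a minimal sketch of how L1-regularized logistic regression can pick out a small set of metrics when there are many more features than samples. It is not the authors' implementation: the abstract names only the statistical technique, so the solver (proximal gradient descent, also known as ISTA), the synthetic data, the parameter values, and all function names are assumptions made for illustration. Treating the metrics with non-zero weight as the "signature" is likewise an assumption of this sketch.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrinks v toward zero, clipping at zero.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.4, step=0.05, iters=200):
    """Minimize mean log-loss + lam * ||w||_1 by proximal gradient descent.

    lam, step, and iters are illustrative values, not taken from the paper.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(iters):
        # Residuals p_i - y_i of the current model.
        r = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
             for row, yi in zip(X, y)]
        # Gradient of the mean log-loss with respect to each weight.
        grad = [sum(r[i] * X[i][j] for i in range(n)) / n for j in range(d)]
        # The intercept is not penalized; weights get a gradient step
        # followed by soft-thresholding, which produces exact zeros.
        b -= step * sum(r) / n
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad)]
    return w, b


def make_synthetic(n=20, d=200, seed=0):
    # Hypothetical data: 20 samples, 200 metrics. Only metric 0 tracks the
    # label (1 = performance problem, 0 = normal); the rest are pure noise.
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(d)]
        row[0] = (2.0 if label else -2.0) + rng.gauss(0.0, 0.1)
        X.append(row)
        y.append(label)
    return X, y


X, y = make_synthetic()
w, b = fit_l1_logistic(X, y)

# Indices of the metrics that survive the L1 penalty.
signature = [j for j, wj in enumerate(w) if wj != 0.0]
print(len(signature), "of", len(w), "metrics kept; metric 0 selected:",
      0 in signature)
```

The L1 penalty drives most weights to exactly zero, which is what keeps the resulting model compact even with ten times more features than samples. In practice one would likely use an established solver and tune the penalty strength by cross-validation rather than fixing it by hand as done here.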