Detection of abrupt changes: theory and application
Detection of abrupt changes: theory and application
Artificial Intelligence
Correlation-based Feature Selection for Discrete and Numeric Class Machine Learning
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Performance debugging for distributed systems of black boxes
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Xen and the art of virtualization
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Pattern Classification (2nd Edition)
Pattern Classification (2nd Edition)
I/O system performance debugging using model-driven anomaly characterization
FAST'05 Proceedings of the 4th conference on USENIX Conference on File and Storage Technologies - Volume 4
Path-based faliure and evolution management
NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Using magpie for request extraction and workload modelling
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Model-based resource provisioning in a web service utility
USITS'03 Proceedings of the 4th conference on USENIX Symposium on Internet Technologies and Systems - Volume 4
Exploiting nonstationarity for performance prediction
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Adaptive control of virtualized resources in utility computing environments
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Fingerprinting the datacenter: automated classification of performance crises
Proceedings of the 5th European conference on Computer systems
Diagnosing performance changes by comparing request flows
Proceedings of the 8th USENIX conference on Networked systems design and implementation
Fay: extensible distributed tracing from kernels to clusters
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
PREPARE: Predictive Performance Anomaly Prevention for Virtualized Cloud Systems
ICDCS '12 Proceedings of the 2012 IEEE 32nd International Conference on Distributed Computing Systems
Hi-index | 0.00 |
Many business customers hesitate to move all their applications to the cloud due to performance concerns. White-box diagnosis relies on human expert experience or performance troubleshooting "cookbooks" to find potential performance bottlenecks. Despite wide adoption, the scalability and adaptivity of such approaches remain severely constrained, especially in a highly-dynamic, consolidated cloud environment. Leveraging the rich telemetry collected from applications and systems in the cloud, and the power of statistical learning, vPerfGuard complements the existing approaches with a model-driven framework by: (1) automatically identifying system metrics that are most predictive of application performance, and (2) adaptively detecting changes in the performance and potential shifts in the predictive metrics that may accompany such a change. Although correlation does not imply causation, the predictive system metrics point to potential causes that can guide a cloud service provider to zero in on the root cause. We have implemented vPerfGuard as a combination of three modules: a sensor module, a model building module, and a model updating module. We evaluate its effectiveness using different benchmarks and different workload types, specifically focusing on various resource (CPU, memory, disk I/O) contention scenarios that are caused by workload surges or "noisy neighbors". The results show that vPerfGuard automatically points to the correct performance bottleneck in each scenario, including the type of the contended resource and the host where the contention occurred.