In this paper, we present a stream-based mining algorithm for online anomaly prediction. Many real-world applications, such as data stream analysis, require continuous cluster operation. Unfortunately, today's large-scale cluster systems are still vulnerable to various software and hardware problems. System administrators are often overwhelmed by the task of correcting system anomalies such as processing bottlenecks (i.e., full stream buffers), resource hot spots, and service level objective (SLO) violations. Our anomaly prediction scheme raises early alerts for impending system anomalies and suggests possible anomaly causes. Specifically, we employ Bayesian classification methods to capture different anomaly symptoms and infer anomaly causes. Markov models are introduced to capture the changing patterns of different measurement metrics. More importantly, our scheme combines Markov models and Bayesian classification methods to predict when a system anomaly will appear in the foreseeable future and what its possible causes are. To the best of our knowledge, our work provides the first stream-based mining algorithm for predicting system anomalies. We have implemented our approach within the IBM System S distributed stream processing cluster and conducted case study experiments using fully implemented distributed data analysis applications processing real application workloads. Our experiments show that our approach efficiently predicts and diagnoses several bottleneck anomalies with high accuracy while imposing low overhead on the cluster system.
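The combination of Markov models and Bayesian classification described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the three-level discretization (0 = low, 1 = medium, 2 = high), the toy traces for a buffer metric and a CPU metric, and the labeling rule are all hypothetical. It assumes one first-order Markov chain per metric, a naive Bayes classifier that treats metrics as independent, and a prediction made by feeding the chain's k-step-ahead state distribution into the classifier.

```python
def fit_markov(states, n_states, alpha=1.0):
    """Estimate a first-order transition matrix with Laplace smoothing."""
    counts = [[alpha] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    return [[c / sum(row) for c in row] for row in counts]


def step_ahead(state, P, k):
    """Distribution over states k steps after observing `state`."""
    n = len(P)
    dist = [1.0 if s == state else 0.0 for s in range(n)]
    for _ in range(k):
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return dist


def fit_naive_bayes(samples, labels, n_states, alpha=1.0):
    """Fit class priors and per-metric state likelihoods (0 = normal, 1 = anomaly)."""
    n_metrics = len(samples[0])
    counts = {c: [[alpha] * n_states for _ in range(n_metrics)] for c in (0, 1)}
    for x, y in zip(samples, labels):
        for m, s in enumerate(x):
            counts[y][m][s] += 1
    cond = {c: [[v / sum(row) for v in row] for row in counts[c]] for c in (0, 1)}
    prior = {c: (labels.count(c) + alpha) / (len(labels) + 2 * alpha) for c in (0, 1)}
    return prior, cond


def predict_anomaly(current, chains, prior, cond, k):
    """P(anomaly in k steps): classify the Markov-predicted state distributions."""
    score = {}
    for c in (0, 1):
        p = prior[c]
        for m, (state, P) in enumerate(zip(current, chains)):
            dist = step_ahead(state, P, k)
            # Expected likelihood of metric m under class c.
            p *= sum(d * cond[c][m][s] for s, d in enumerate(dist))
        score[c] = p
    return score[1] / (score[0] + score[1])


# Hypothetical discretized traces: the buffer fills periodically, CPU follows.
buffer_trace = [0, 0, 1, 1, 2, 2] * 5
cpu_trace = [0, 0, 0, 1, 1, 2] * 5
labels = [1 if b == 2 else 0 for b in buffer_trace]  # anomaly = full buffer

chains = [fit_markov(buffer_trace, 3), fit_markov(cpu_trace, 3)]
prior, cond = fit_naive_bayes(list(zip(buffer_trace, cpu_trace)), labels, 3)

p_rising = predict_anomaly((1, 1), chains, prior, cond, k=1)  # metrics climbing
p_idle = predict_anomaly((0, 0), chains, prior, cond, k=1)    # metrics low
print(f"rising: {p_rising:.2f}, idle: {p_idle:.2f}")
```

On this toy data the alert probability is higher when both metrics are already at the medium level than when both are low, which is the early-warning behavior the scheme aims for. Comparing the per-metric likelihood terms inside `predict_anomaly` would be one simple way to rank candidate causes, again as an assumption rather than the paper's method.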