EbAT: online methods for detecting utility cloud anomalies

  • Authors:
  • Chengwei Wang

  • Affiliations:
  • Georgia Institute of Technology, Atlanta, GA

  • Venue:
  • Proceedings of the 6th Middleware Doctoral Symposium
  • Year:
  • 2009

Abstract

The online detection of anomalies is a vital element of operations in data centers and in utility clouds like Amazon EC2. Given ever-increasing data center sizes coupled with the complexities of systems software, applications, and workload patterns, such anomaly detection must operate automatically at runtime, without prior knowledge of normal or anomalous behaviors. Further, detection should function at different levels of abstraction, such as hardware and software, and for the multiple metrics used in cloud computing systems. This paper proposes EbAT (Entropy-based Anomaly Testing), which offers novel methods that detect anomalies by analyzing the distributions of arbitrary metrics rather than individual metric thresholds. Entropy is used as a measurement that captures the degree of dispersal or concentration of such distributions, aggregating raw metric data across the cloud stack to form entropy time series. For scalability, such time series can then be combined hierarchically and across multiple cloud subsystems. Finally, online tools (time series analysis, signal processing, or subspace methods) are used to identify anomalies in the entropy time series (matrices) in each subsystem or at each level of the hierarchy. One outcome is the ability to 'zoom in' on the components and metrics where anomalies may be originating. Experimental results demonstrate the viability of the approach; future experimentation will focus on scalable operation as well as on further reliability evaluation and improvement.
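
To make the idea of an entropy time series concrete, the following is a minimal sketch and not the paper's implementation. It assumes a single metric (for example, CPU utilization in percent) sampled in fixed windows, a fixed-width histogram as the distribution estimate, and a simple change threshold standing in for the time series, signal processing, or subspace tools the abstract mentions. The function names, bin count, value range, and threshold are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(samples, num_bins=5, lo=0.0, hi=100.0):
    """Shannon entropy (in bits) of a fixed-width histogram of `samples`.

    The binning scheme (num_bins, lo, hi) is an assumption of this sketch.
    """
    width = (hi - lo) / num_bins
    # Map each sample to a bin index, clamping out-of-range values.
    counts = Counter(
        min(max(int((x - lo) / width), 0), num_bins - 1) for x in samples
    )
    total = len(samples)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def entropy_series(windows, **kwargs):
    """Turn a sequence of sample windows into an entropy time series."""
    return [shannon_entropy(w, **kwargs) for w in windows]


def flag_changes(series, threshold=0.5):
    """Indices where entropy changes by more than `threshold` bits.

    A deliberately simple stand-in for a real online detector.
    """
    return [
        i for i in range(1, len(series))
        if abs(series[i] - series[i - 1]) > threshold
    ]


# Hypothetical CPU utilization samples, one list per monitoring window.
windows = [
    [10, 12, 11, 13],   # concentrated: all samples fall in one bin
    [11, 10, 14, 12],   # concentrated
    [10, 30, 50, 70],   # dispersed: samples spread over four bins
    [12, 11, 10, 13],   # concentrated again
]
series = entropy_series(windows)
flagged = flag_changes(series)
print(series, flagged)
```

In this toy input the third window spreads its samples over four bins, so its entropy rises from 0 to 2 bits; the detector flags both that rise and the subsequent return to a concentrated distribution. Combining series hierarchically across subsystems, as the abstract describes, is outside the scope of this sketch.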