A flexible architecture integrating monitoring and analytics for managing large-scale data centers

  • Authors:
  • Chengwei Wang; Karsten Schwan; Vanish Talwar; Greg Eisenhauer; Liting Hu; Matthew Wolf

  • Affiliations:
  • College of Computing, Georgia Institute of Technology, Atlanta, GA, USA; College of Computing, Georgia Institute of Technology, Atlanta, GA, USA; HP Labs, Palo Alto, CA, USA; College of Computing, Georgia Institute of Technology, Atlanta, GA, USA; College of Computing, Georgia Institute of Technology, Atlanta, GA, USA; College of Computing, Georgia Institute of Technology, Atlanta, GA, USA

  • Venue:
  • Proceedings of the 8th ACM international conference on Autonomic computing
  • Year:
  • 2011

Abstract

To effectively manage large-scale data centers and utility clouds, operators must understand current system and application behaviors. This requires continuous, real-time monitoring along with on-line analysis of the data captured by the monitoring system, i.e., integrated monitoring and analytics, termed Monalytics [28]. A key challenge with such integration is balancing the costs incurred and the associated delays against the benefits of identifying and reacting to undesirable or non-performing system states in a timely fashion. This paper presents a novel, flexible architecture for Monalytics in which such trade-offs are easily made by dynamically constructing software overlays, called Distributed Computation Graphs (DCGs), to implement the desired analytics functions. A prototype of Monalytics implementing this architecture is evaluated with motivating use cases in small-scale data center experiments, and a series of analytical models is used to understand the above trade-offs at large scales. Results show that the approach provides the flexibility needed to meet the demands of autonomic management at large scale, with considerably better performance/cost than traditional and brute-force solutions.