Profiling services for resource optimization and capacity planning in distributed systems

  • Authors:
  • Guofei Jiang;Haifeng Chen;Kenji Yoshihira

  • Affiliations:
  • NEC Laboratories America, Princeton, USA 08540;NEC Laboratories America, Princeton, USA 08540;NEC Laboratories America, Princeton, USA 08540

  • Venue:
  • Cluster Computing
  • Year:
  • 2008

Abstract

The capacity needs of online services are determined mainly by the volume of user load. In large-scale distributed systems running such services, it is difficult to match the capacities of the various system components. This paper proposes a novel and systematic approach to profiling services for resource optimization and capacity planning. We collect resource-consumption measurements from components across the distributed system and search for constant relationships among these measurements. If such a relationship continues to hold over time under varying workloads, we regard it as an invariant of the underlying system. After extracting many invariants, we can, for any given volume of user load, follow these invariant relationships sequentially to estimate the capacity needs of individual components. Comparing the current resource configuration against the estimated capacity needs reveals the weakest points that may degrade system performance. Operators can consult these analytical results to optimize resource assignments and remove potential performance bottlenecks. We propose several algorithms that support capacity analysis and guide operators' capacity planning tasks, evaluate them on real systems, and report experimental results that demonstrate the effectiveness of the approach.
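
The workflow the abstract describes (extract invariants, follow them from the user load to each component, then compare against the configured capacity) can be illustrated with a minimal Python sketch. The abstract does not specify the form of the invariants, so everything here is an assumption made for illustration: the invariants are taken to be simple linear relationships y = a*x + b, the fitness score is an ordinary R², and the metric names, threshold, and numbers are hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Invariant:
    """Assumed linear relationship: target = slope * source + intercept."""
    source: str
    target: str
    slope: float
    intercept: float


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("source measurement does not vary")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fitness(xs, ys, slope, intercept):
    """R^2 of a fixed line on one window of data (1.0 is a perfect fit)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot


def extract_invariant(source, target, windows, threshold=0.9):
    """Fit on the first time window; keep the relationship only if it
    still holds (fitness >= threshold) in every window."""
    slope, intercept = fit_line(*windows[0])
    if all(fitness(xs, ys, slope, intercept) >= threshold for xs, ys in windows):
        return Invariant(source, target, slope, intercept)
    return None


def estimate_needs(load_metric, load, invariants):
    """Follow invariants outward from the user load, one hop at a time."""
    needs = {load_metric: load}
    pending = list(invariants)
    progressed = True
    while pending and progressed:
        progressed = False
        for inv in list(pending):
            if inv.source in needs:
                needs[inv.target] = inv.slope * needs[inv.source] + inv.intercept
                pending.remove(inv)
                progressed = True
    return needs


def weakest_point(needs, capacities):
    """Return the metric with the highest estimated need / configured capacity."""
    ratios = {m: needs[m] / capacities[m] for m in capacities if m in needs}
    return max(ratios, key=ratios.get), ratios


# Hypothetical measurements, two time windows per metric pair.
req_to_web_cpu = [([100, 200, 300, 400], [7, 12, 17, 22]),
                  ([500, 800, 1000], [27, 42, 52])]
req_to_db_queries = [([100, 200, 300], [300, 600, 900]),
                     ([400, 900], [1200, 2700])]
db_queries_to_db_cpu = [([300, 600, 900], [4, 7, 10]),
                        ([1200, 2700], [13, 28])]
drifting_pair = [([1, 2, 3], [2, 4, 6]), ([1, 2, 3], [10, 5, 1])]

candidates = [
    extract_invariant("requests", "web_cpu", req_to_web_cpu),
    extract_invariant("requests", "db_queries", req_to_db_queries),
    extract_invariant("db_queries", "db_cpu", db_queries_to_db_cpu),
    extract_invariant("metric_a", "metric_b", drifting_pair),  # rejected
]
invariants = [inv for inv in candidates if inv is not None]

needs = estimate_needs("requests", 1000, invariants)
bottleneck, ratios = weakest_point(needs, {"web_cpu": 100, "db_cpu": 40})
print(needs)
print(bottleneck, ratios)
```

In this toy setup the relationship that drifts between windows is discarded, and the remaining three are chained from the request rate to the database CPU. The component whose estimated need is closest to its configured capacity is reported as the weakest point, which is the kind of result an operator could use when reassigning resources.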