When average is not average: large response time fluctuations in n-tier systems

  • Authors:
  • Qingyang Wang;Yasuhiko Kanemasa;Motoyuki Kawaba;Calton Pu

  • Affiliations:
  • Georgia Institute of Technology, Atlanta, GA, USA;FUJITSU LABORATORIES LTD., Kanagawa, Japan;FUJITSU LABORATORIES LTD., Kanagawa, Japan;Georgia Institute of Technology, Atlanta, GA, USA

  • Venue:
  • Proceedings of the 9th international conference on Autonomic computing
  • Year:
  • 2012

Abstract

Simultaneously achieving good performance and high resource utilization is an important goal for production cloud environments. Through extensive measurements of an n-tier application benchmark (RUBBoS), we show that system response time frequently exhibits large-scale fluctuations (e.g., ranging from tens of milliseconds up to tens of seconds) during periods of high resource utilization. Besides bursty workloads from clients, we found that these large-scale response time fluctuations can be caused by system environmental conditions (e.g., L2 cache misses, JVM garbage collection, inefficient scheduling policies) that commonly exist in n-tier applications. The impact of these conditions on end-to-end response time can be greatly amplified by the complex resource dependencies in the system. For instance, a 50 ms response time increase in the database tier can be amplified to a 500 ms end-to-end response time increase. We evaluate three heuristics to stabilize response time fluctuations while still achieving high resource utilization in the system. Our results show that large-scale response time fluctuations should be taken into account when designing effective autonomous self-scaling n-tier systems in the cloud.
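
As a rough illustration of why an average can hide fluctuations of this kind, the sketch below compares the mean of a set of response times with its percentiles. The sample values are synthetic and chosen only to span the "tens of milliseconds up to tens of seconds" range mentioned in the abstract; they are not measurements from the paper, and the helper name `summarize` and the nearest-rank percentile method are assumptions made for this example.

```python
import statistics


def summarize(samples_ms):
    """Return mean, median, 99th percentile, and max of response times (ms).

    Percentiles use the nearest-rank method with integer arithmetic,
    which avoids floating-point rounding in the rank computation.
    """
    ordered = sorted(samples_ms)
    n = len(ordered)

    def percentile(p):
        # Nearest rank: smallest rank r such that r / n >= p / 100.
        rank = max(1, -(-p * n // 100))  # integer ceiling of p * n / 100
        return ordered[rank - 1]

    return {
        "mean": statistics.mean(ordered),
        "p50": percentile(50),
        "p99": percentile(99),
        "max": ordered[-1],
    }


# Synthetic workload (assumed for illustration): 980 requests finish in
# 20 ms, while 20 requests stall for 10 seconds.
samples = [20] * 980 + [10_000] * 20
stats = summarize(samples)
print(stats)
```

With these synthetic values the mean lands in the low hundreds of milliseconds, a latency that no individual request actually experienced, while the median stays at 20 ms and the 99th percentile reaches 10 seconds. Reporting percentiles alongside the mean is one common way to make such fluctuations visible.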