Systematic adoption of genetic programming for deriving software performance curves

  • Authors:
  • Michael Faber;Jens Happe

  • Affiliations:
  • Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany;SAP Research, Karlsruhe, Germany

  • Venue:
  • ICPE '12 Proceedings of the 3rd ACM/SPEC International Conference on Performance Engineering
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Measurement-based approaches to software performance engineering apply analysis methods (e.g., statistical inference or machine learning) on raw measurement data with the goal to build a mathematical model describing the performance-relevant behavior of a system under test (SUT). The main challenge for such approaches is to find a reasonable trade-off between minimizing the amount of necessary measurement data used to build the model and maximizing the model's accuracy. Most existing methods require prior knowledge about parameter dependencies or their models are limited to only linear correlations. In this paper, we investigate the applicability of genetic programming (GP) to derive a mathematical equation expressing the performance behavior of the measured system (software performance curve). We systematically optimized the parameters of the GP algorithm to derive accurate software performance curves and applied techniques to prevent overfitting. We conducted an evaluation with a representative MySQL database system. The results clearly show that the GP algorithm outperforms other analysis techniques like inverse distance weighting (IDW) and multivariate adaptive regression splines (MARS) in terms of model accuracy.