Benchmarking as a method of assessing software performance is known to suffer from random fluctuations that distort the observed performance. In this paper, we focus on the fluctuations caused by compilation. We show that the design of a benchmarking experiment must reflect the existence of these fluctuations if the performance observed during the experiment is to be representative of reality. We present a new statistical model of a benchmark experiment that reflects the presence of fluctuations in compilation, execution and measurement. The model describes the observed performance and makes it possible to calculate the optimum dimensions of the experiment that yield the best precision within a given amount of time. Using a variety of benchmarks, we evaluate the model in the context of regression benchmarking and show that it significantly decreases the number of erroneously detected performance changes.
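The abstract does not reproduce the model itself. As a minimal sketch of the idea, assume a nested random-effects model in which every measurement is influenced by the compilation it belongs to and by the execution within that compilation; the notation below is illustrative, not necessarily the paper's own:

\[
Y_{ijk} = \mu + C_i + E_{ij} + \varepsilon_{ijk},
\qquad
C_i \sim (0, \sigma_C^2),\quad
E_{ij} \sim (0, \sigma_E^2),\quad
\varepsilon_{ijk} \sim (0, \sigma_M^2),
\]

where \(Y_{ijk}\) is the \(k\)-th measurement in the \(j\)-th execution of the \(i\)-th compilation, and the three random effects capture the fluctuations in compilation, execution and measurement, respectively. With \(n_c\) compilations, \(n_e\) executions per compilation and \(n_m\) measurements per execution, the variance of the grand mean in such a balanced nested design is

\[
\operatorname{Var}(\bar{Y}) =
\frac{\sigma_C^2}{n_c} +
\frac{\sigma_E^2}{n_c n_e} +
\frac{\sigma_M^2}{n_c n_e n_m},
\]

so minimizing this expression subject to a fixed time budget (with each compilation, execution and measurement having a known cost) is one way to arrive at the optimum dimensions \(n_c\), \(n_e\), \(n_m\) of the experiment that the abstract refers to.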