Empirical systems research is facing a dilemma. Minor aspects of an experimental setup can have a significant impact on the associated performance measurements and potentially invalidate the conclusions drawn from them. Examples of such influences, often called hidden factors, include binary link order, process environment size, compiler-generated randomized symbol names, and group scheduler assignments. The growth in complexity and size of modern systems will further aggravate this dilemma, especially given the time pressure to produce results. So how can one trust any reported empirical analysis of a new idea or concept in computer science? This paper introduces DataMill, a community-based, easy-to-use, service-oriented open benchmarking infrastructure for performance evaluation. DataMill facilitates producing robust, reliable, and reproducible results. The infrastructure incorporates the latest results on hidden factors and automates the variation of these factors. Multiple research groups already participate in DataMill. DataMill is also of interest for research on performance evaluation itself: the infrastructure supports quantifying the effect of hidden factors and disseminating these research results beyond mere reporting, and it provides a platform for investigating interactions and compositions of hidden factors.
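To make the notion of a hidden factor concrete, the following minimal sketch varies one such factor, the process environment size, across repeated runs of a benchmark and reports the mean wall-clock time per setting. This is not part of DataMill; the benchmark command and padding sizes are hypothetical placeholders chosen only for illustration.

```python
#!/usr/bin/env python3
"""Sketch: vary a single hidden factor (process environment size) across
benchmark runs to expose its effect on measured performance.
The benchmark command and padding levels are illustrative assumptions."""

import os
import subprocess
import time

BENCHMARK_CMD = ["true"]              # placeholder; substitute the benchmark under test
PAD_SIZES = [0, 1024, 4096, 16384]    # bytes of extra environment padding (arbitrary levels)
RUNS_PER_LEVEL = 5

for pad in PAD_SIZES:
    env = dict(os.environ)
    # Extra environment bytes shift the initial stack layout, a known hidden factor.
    env["EXPERIMENT_PADDING"] = "x" * pad
    samples = []
    for _ in range(RUNS_PER_LEVEL):
        start = time.perf_counter()
        subprocess.run(BENCHMARK_CMD, env=env, check=True,
                       stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    mean = sum(samples) / len(samples)
    print(f"env padding {pad:6d} B: mean wall time {mean:.4f} s "
          f"over {RUNS_PER_LEVEL} runs")
```

As described in the abstract, DataMill automates this kind of factor variation within the infrastructure itself, so experimenters do not have to hand-script it for each study.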