This paper presents a surprising result: changing a seemingly innocuous aspect of an experimental setup can cause a systems researcher to draw wrong conclusions from an experiment. What appears to be an innocuous aspect of the setup may in fact introduce a significant bias into an evaluation. This phenomenon is called measurement bias in the natural and social sciences. Our results demonstrate that measurement bias is significant and commonplace in computer system evaluation. By significant we mean that measurement bias can lead to a performance analysis that either overstates an effect or even yields an incorrect conclusion. By commonplace we mean that measurement bias occurs on all architectures that we tried (Pentium 4, Core 2, and m5 O3CPU), with both compilers that we tried (gcc and Intel's C compiler), and in most of the SPEC CPU2006 C programs. Thus, we cannot ignore measurement bias. Nevertheless, in a literature survey of 133 recent papers from ASPLOS, PACT, PLDI, and CGO, we determined that none of the papers with experimental results adequately considers measurement bias. Inspired by similar problems and their solutions in other sciences, we describe and demonstrate two methods, one for detecting measurement bias (causal analysis) and one for avoiding it (setup randomization).
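To make the idea of setup randomization concrete, the following is a minimal sketch and not the authors' tooling. It assumes the varied aspect of the setup can be encoded as a single number (here a hypothetical environment size in bytes), and it uses a synthetic `fake_measure` function in place of real benchmark runs. All names (`randomized_speedups`, `summarize`, `fake_measure`, `SETUPS`) are illustrative assumptions. The synthetic model has no real speedup; only setup-dependent bias separates the two variants.

```python
import random
import statistics


def randomized_speedups(measure, setups, trials, rng):
    """Measure baseline vs. optimized under randomly drawn setups.

    Both variants are measured under the same drawn setup, so each
    speedup compares like with like; the setup varies across trials.
    """
    speedups = []
    for _ in range(trials):
        setup = rng.choice(setups)
        speedups.append(measure("baseline", setup) / measure("optimized", setup))
    return speedups


def summarize(speedups, z=1.96):
    """Return the mean speedup and an approximate 95% confidence interval."""
    mean = statistics.mean(speedups)
    half_width = z * statistics.stdev(speedups) / len(speedups) ** 0.5
    return mean, (mean - half_width, mean + half_width)


def fake_measure(variant, setup):
    """Synthetic stand-in for a benchmark run (assumption, not real data).

    Both variants take 100 time units on average; the setup only shifts
    each variant by a different amount between -3 and +3 units.
    """
    residue = (setup // 64) % 7
    if variant == "baseline":
        bias = residue - 3
    else:
        bias = (3 * residue) % 7 - 3
    return 100.0 + bias


# Hypothetical setups: environment sizes from 0 to 4032 bytes in 64-byte steps.
SETUPS = list(range(0, 4096, 64))

if __name__ == "__main__":
    one_setup = fake_measure("baseline", 320) / fake_measure("optimized", 320)
    print(f"speedup seen in a single setup (320 bytes): {one_setup:.3f}")

    mean, (low, high) = summarize(
        randomized_speedups(fake_measure, SETUPS, 200, random.Random(0))
    )
    print(f"speedup over randomized setups: {mean:.3f} [{low:.3f}, {high:.3f}]")
```

In this synthetic model, a single fixed setup can suggest a speedup of about 4% even though the two variants are equally fast on average. Drawing the setup at random for each trial turns that bias into visible variation, which the confidence interval then reports.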