Several aspects of a computer system cause performance measurements to include random errors. Moreover, these systems are typically composed of a non-trivial combination of individual components that may cause one system to perform better or worse than another depending on the workload. Hence, properly measuring and comparing the performance of computer systems are non-trivial tasks. The majority of the work published at recent major computer architecture conferences does not report the random errors measured in the experiments. The few remaining authors quantify and factor out random errors using only confidence intervals or standard deviations. Recent publications claim that this approach can still lead to misleading conclusions. In this work, we reproduce and discuss the results obtained in a previous study. Finally, we propose SToCS, a tool that integrates several statistical frameworks and facilitates the analysis of computer science experiments.
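As a minimal illustration of the kind of error quantification discussed above (not the authors' SToCS tool, and with made-up timing data), the sketch below summarizes repeated benchmark runs with a mean and a Student-t 95% confidence interval, the approach the abstract notes is commonly used to factor out random errors.

```python
# Hedged sketch: summarize repeated benchmark runs with a mean and a
# t-based 95% confidence interval. The run times below are hypothetical.
import math
import statistics
from scipy import stats

def confidence_interval(samples, confidence=0.95):
    """Return (mean, half_width) of a t-based confidence interval."""
    n = len(samples)
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)  # standard error of the mean
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    return mean, t_crit * sem

# Hypothetical wall-clock times (seconds) for ten runs of the same workload.
runs = [12.1, 11.9, 12.4, 12.0, 12.2, 11.8, 12.3, 12.1, 12.0, 12.2]
mean, half_width = confidence_interval(runs)
print(f"{mean:.2f} s +/- {half_width:.2f} s (95% CI)")
```

Reporting the interval alongside the mean makes the random error visible; two systems whose intervals overlap substantially should not be declared different on the mean alone.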