Measuring Cache and TLB Performance and Their Effect on Benchmark Runtimes
IEEE Transactions on Computers
Automatic measurement of memory hierarchy parameters
SIGMETRICS '05 Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
mhz: anatomy of a micro-benchmark
ATEC '98 Proceedings of the annual conference on USENIX Annual Technical Conference
lmbench: portable tools for performance analysis
ATEC '96 Proceedings of the 1996 annual conference on USENIX Annual Technical Conference
Automatic measurement of instruction cache capacity
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Hi-index | 0.00 |
DARPA's AACE project aimed to develop Architecture Aware Compiler Environments that automatically characterize the hardware and optimize the application codes accordingly. We present the BlackjackBench suite, a collection of portable micro-benchmarks that automate system characterization, plus statistical analysis techniques for interpreting the results. The BlackjackBench discovers the effective sizes and speeds of the hardware environment rather than the often unattainable peak values. We aim at hardware features that can be observed by running executables generated by existing compilers from standard C codes. We characterize the memory hierarchy, including cache sharing and NUMA characteristics of the system, properties of the processing cores affecting execution speed, and the length of the OS scheduler time slot. We show how these features of modern multicores can be discovered programmatically. We also show how the features could interfere with each other resulting in incorrect interpretation of the results, and how established classification and statistical analysis techniques reduce experimental noise and aid automatic interpretation of results.