The running time of many applications is dominated by the cost of memory operations. To optimize such applications for a given platform, it is necessary to have detailed knowledge of that platform's memory hierarchy parameters. In practice, this information is poorly documented, if it is documented at all. Moreover, there is growing interest in self-tuning, autonomic software systems that can optimize themselves for different platforms; these systems must determine memory hierarchy parameters automatically, without human intervention.

One solution is to use micro-benchmarks to determine the parameters of the memory hierarchy. In this paper, we argue that existing micro-benchmarks are inadequate, and we present novel micro-benchmarks for determining parameters of all levels of the memory hierarchy, including registers, all data caches, and the translation look-aside buffer. We have implemented these micro-benchmarks in a tool called X-Ray that can be ported easily to new platforms. We present experimental results showing that X-Ray successfully determines memory hierarchy parameters on current platforms, and we compare its accuracy with that of existing tools.
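To make the idea of a memory micro-benchmark concrete, the sketch below shows a generic pointer-chasing probe. It is not X-Ray's method, which the abstract does not describe. It builds a random single-cycle chain of indices over a working set of a chosen size and times dependent loads through it. All names (`make_cycle`, `chase`, `ns_per_access`) and the chosen sizes are hypothetical. Interpreter overhead in Python hides much of the cache effect, so the timings are only indicative; a real tool would use a compiled language.

```python
import random
import time
from array import array


def make_cycle(n, seed=0):
    """Return a permutation of range(n) that forms one cycle (Sattolo's algorithm)."""
    rng = random.Random(seed)
    nxt = array("q", range(n))  # contiguous 8-byte slots, so footprint is n * 8 bytes
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i is what guarantees a single n-cycle
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps):
    """Follow the chain for `steps` dependent loads and return the final index."""
    i = 0
    for _ in range(steps):
        i = nxt[i]  # each load depends on the previous one
    return i


def ns_per_access(n, steps=200_000):
    """Average time per chained access, in nanoseconds, for an n-slot working set."""
    nxt = make_cycle(n)
    chase(nxt, min(n, steps))  # warm-up pass to touch the working set
    start = time.perf_counter()
    chase(nxt, steps)
    return (time.perf_counter() - start) * 1e9 / steps


if __name__ == "__main__":
    # Assumed working-set sizes; a jump in ns/access hints at a capacity boundary.
    for kib in (4, 32, 256, 2048, 8192):
        n = kib * 1024 // 8
        print(f"{kib:6d} KiB  {ns_per_access(n):7.1f} ns/access")
```

The random single cycle matters for two reasons: every slot is visited before any repeats, so the whole working set is exercised, and the unpredictable order makes it harder for hardware prefetchers to hide miss latency.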