Automatic instruction scheduler retargeting by reverse-engineering
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
P-Ray: A Software Suite for Multi-core Architecture Characterization
Languages and Compilers for Parallel Computing
Automatic measurement of instruction cache capacity
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
The Servet 3.0 benchmark suite: Characterization of network performance degradation
Computers and Electrical Engineering
Hi-index | 0.00 |
There is growing interest in self-optimizing computing systems that can optimize their own behavior on different platforms without manual intervention. Examples of successful self-optimizing systems are ATLAS, which generates Basic Linear Algebra Subroutine (BLAS) Libraries, and FFTW, which generates FFT libraries. Self-optimizing systems need values for hardware parameters such as the number of registers of various types and the capacities of caches at various levels. For example, ATLAS uses the capacity of the L1 cache and the number of registers in determining the size of cache tiles and register tiles. In this paper, we describe X-Ray1, a system for implementing micro-benchmarks to measure such hardware parameters. We also present novel algorithms for measuring some of these parameters. Experimental evaluations of X-Ray on traditional workstations, servers and embedded systems show that X-Ray produces more accurate and complete results than existing tools.