This paper describes scheduling optimizations in the Intel® Itanium® compiler that prevent cache penalties caused by various micro-architectural effects on the Itanium 2 processor. The goal is not to improve cache hit rates but to avoid penalties that occur even when an access hits in the cache, penalties that probably all processors incur in one form or another. The optimizations rely on sophisticated methods for disambiguating memory references, and the paper examines the performance improvement obtained by integrating these methods into the cache optimizations.