Program optimization for instruction caches
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Profile guided code positioning
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Design and evaluation of a compiler algorithm for prefetching
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
ACM Letters on Programming Languages and Systems (LOPLAS)
Improving the cache locality of memory allocation
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Cache performance of garbage-collected programs
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Static branch frequency and program profile analysis
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Reducing false sharing on shared memory multiprocessors through compile time data transformations
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Using generational garbage collection to implement cache-conscious data placement
Proceedings of the 1st international symposium on Memory management
Cache-conscious data placement
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Cache-conscious structure layout
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Cache-conscious structure definition
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Automated data-member layout of heap objects to improve memory-hierarchy performance
ACM Transactions on Programming Languages and Systems (TOPLAS)
An efficient profile-analysis framework for data-layout optimizations
POPL '02 Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
System Five Application Binary Interface
System Five Application Binary Interface
HP Caliper: A Framework for Performance Analysis Tools
IEEE Concurrency
False Sharing and Spatial Locality in Multiprocessor Caches
IEEE Transactions on Computers
Data remapping for design space optimization of embedded memory systems
ACM Transactions on Embedded Computing Systems (TECS)
Optimization opportunities created by global data reordering
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
A compiler framework for restructuring data declarations to enhance cache and TLB effectiveness
CASCON '94 Proceedings of the 1994 conference of the Centre for Advanced Studies on Collaborative research
Improving Cache Behavior of Dynamically Allocated Data Structures
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
SYZYGY - A Framework for Scalable Cross-Module IPO
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
A data locality optimizing algorithm
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Array regrouping and structure splitting using whole-program reference affinity
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
The garbage collection advantage: improving program locality
OOPSLA '04 Proceedings of the 19th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Automatic pool allocation: improving performance by controlling data structure layout in the heap
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Practical Structure Layout Optimization and Advice
Proceedings of the International Symposium on Code Generation and Optimization
Structure Layout Optimization for Multithreaded Programs
Proceedings of the International Symposium on Code Generation and Optimization
A compiler framework for general memory layout optimizations targeting structures
Proceedings of the 2010 Workshop on Interaction between Compilers and Computer Architecture
Hi-index | 0.00 |
On machines with high-performance processors, the memory system continues to be a performance bottleneck. Compilers insert prefetch operations and reorder data accesses to improve locality, but increasingly seek to modify an application's data layout to reduce cache miss and page fault penalties. In this paper we discuss Global Variable Layout (GVL), an optimization of the placement of entire static global data objects in the binary. We describe two practical methods for GVL in the HP-UX Integrity optimizing compiler for the Itanium © architecture. The first layout strategy relies on profile feedback, collaboratively employing the compiler, the linker and a pre-link tool to facilitate reordering. The second strategy uses whole-program analysis to drive data layout decisions, and does not require the use of a dynamic profile. We give a detailed description of our implementation and evaluate its performance for the SPEC integer benchmark programs, as well as for a large commercial database application.