Combinatorial optimization: algorithms and complexity
Combinatorial optimization: algorithms and complexity
Achieving high instruction cache performance with an optimizing compiler
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Evaluating Associativity in CPU Caches
IEEE Transactions on Computers
Profile guided code positioning
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Procedure merging with instruction caches
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Efficient simulation of caches under optimal replacement with applications to miss characterization
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Compile time instruction cache optimizations
ACM SIGARCH Computer Architecture News - Special issue: panel sessions of the 1991 workshop on multithreaded computers
A data cache with multiple caching strategies tuned to different types of locality
ICS '95 Proceedings of the 9th international conference on Supercomputing
Efficient procedure mapping using cache line coloring
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Eliminating cache conflict misses through XOR-based placement functions
ICS '97 Proceedings of the 11th international conference on Supercomputing
Procedure placement using temporal ordering information
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Compression-Based Program Characterization for Improving Cache Memory Performance
IEEE Transactions on Computers
Guest Editors' Introduction-Cache Memory and Related Problems: Enhancing and Exploiting the Locality
IEEE Transactions on Computers - Special issue on cache memory and related problems
A Trace Cache Microarchitecture and Evaluation
IEEE Transactions on Computers - Special issue on cache memory and related problems
Augmenting Loop Tiling with Data Alignment for Improved Cache Performance
IEEE Transactions on Computers - Special issue on cache memory and related problems
Improving Cache Locality by a Combination of Loop and Data Transformations
IEEE Transactions on Computers - Special issue on cache memory and related problems
Analysis of Temporal-Based Program Behavior for Improved Instruction Cache Performance
IEEE Transactions on Computers - Special issue on cache memory and related problems
Randomized Cache Placement for Eliminating Conflicts
IEEE Transactions on Computers - Special issue on cache memory and related problems
Optimizing the Instruction Cache Performance of the Operating System
IEEE Transactions on Computers
Cache-conscious structure layout
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
A locality sensitive multi-module cache with explicit management
ICS '99 Proceedings of the 13th international conference on Supercomputing
ICS '99 Proceedings of the 13th international conference on Supercomputing
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Analysis of Cache Performance for Operating Systems and Multiprogramming
Analysis of Cache Performance for Operating Systems and Multiprogramming
The ChARM Tool for Tuning Embedded Systems
IEEE Micro
System-on-chip beyond the nanometer wall
Proceedings of the 40th annual Design Automation Conference
A highly configurable cache architecture for embedded systems
Proceedings of the 30th annual international symposium on Computer architecture
Analysis of cache replacement-algorithms
Analysis of cache replacement-algorithms
MiDataSets: creating the conditions for a more realistic evaluation of Iterative optimization
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Improved procedure placement for set associative caches
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Combining code reordering and cache configuration
ACM Transactions on Embedded Computing Systems (TECS)
Minimizing accumulative memory load cost on multi-core DSPs with multi-level memory
Journal of Systems Architecture: the EUROMICRO Journal
Hi-index | 0.00 |
In the embedded domain, the gap between memory and processor performance and the increase in application complexity need to be supported without wasting precious system resources: die size, power, etc. For these reasons, effective exploitation of small and simple cache memories is of the utmost importance. However, programs running on such caches can experience serious inefficiencies due to cache conflicts.We present a new Cache-Aware Code Allocation Technique (CAT), which transforms the structure of programs so that their behavior toward memory can meet the locality features the cache is able to exploit. The proposed approach uses detailed information of program execution to place program areas into memory and employs the new idea of “look-forward estimation” that helps to seek better global layouts during the placement of each area. CAT-optimized programs outperform the original ones achieving the same miss rate on two times, and sometimes four times, smaller caches. Moreover, CAT improves the instruction miss rate by more than 40% if compared to the best procedure-reordering algorithm. CAT performances derive from the increased number of cache lines that support the execution of optimized applications and from a more balanced load on them.