Combining code reordering and cache configuration

Authors:
Ann Gordon-Ross;Frank Vahid;Nikil Dutt
Affiliations:
University of Florida;University of California, Riverside, CA;University of California, Irvine, CA
Venue:
ACM Transactions on Embedded Computing Systems (TECS)
Year:
2013

Abstract

The instruction cache is a popular optimization target because of its high impact on system performance and power and because of its predictable temporal and spatial locality. This article is an in-depth study of the interaction between code reordering (a long-known technique) and cache configuration (a relatively new technique). Experimental results show that coupling code reordering with cache configuration yields additional energy savings as high as 10-15% for several benchmarks, with cache area reductions as high as 48%. To exploit these additional benefits, we architect and evaluate several design exploration heuristics for combining these two methods.
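
To make the idea of jointly exploring code layout and cache configuration concrete, the following is a minimal sketch and not the article's heuristics. It exhaustively scores every combination of a code layout and a direct-mapped cache configuration on a toy instruction trace. The function names, the two-procedure workload, the candidate cache sizes, and the energy constants are all hypothetical and chosen only for illustration.

```python
from itertools import product

def count_misses(trace, cache_bytes, line_bytes):
    """Simulate a direct-mapped instruction cache and return the miss count."""
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines
    misses = 0
    for addr in trace:
        block = addr // line_bytes
        index = block % num_lines
        if tags[index] != block:
            tags[index] = block
            misses += 1
    return misses

def energy(trace, cache_bytes, line_bytes):
    """Toy energy model; the constants are assumptions, not measured values."""
    misses = count_misses(trace, cache_bytes, line_bytes)
    access_cost = cache_bytes / 1024      # larger caches cost more per access
    miss_cost = 20.0 + line_bytes / 4     # off-chip fetch grows with line size
    return len(trace) * access_cost + misses * miss_cost

def make_trace(layout, calls, body_bytes=64):
    """Expand a call sequence into 4-byte instruction fetch addresses."""
    trace = []
    for name in calls:
        base = layout[name]
        trace.extend(range(base, base + body_bytes, 4))
    return trace

def explore(layouts, calls, sizes, line_sizes):
    """Score every (layout, cache size, line size) triple; keep the cheapest."""
    best = None
    for (label, layout), size, line in product(layouts.items(), sizes, line_sizes):
        cost = energy(make_trace(layout, calls), size, line)
        if best is None or cost < best[0]:
            best = (cost, label, size, line)
    return best

# Hypothetical workload: two procedures that call each other in a tight loop.
layouts = {
    "original": {"f": 0, "g": 1024},   # f and g collide in every candidate cache
    "reordered": {"f": 0, "g": 64},    # f and g placed back to back
}
calls = ["f", "g"] * 100
best = explore(layouts, calls, sizes=[256, 512, 1024], line_sizes=[16, 32])
print(best)
```

In this toy setup the two procedures conflict under the original layout at every candidate cache size, so no configuration alone removes the misses. Once the procedures are placed adjacently, the smallest cache becomes the cheapest choice. Exhaustive search is used here only because the space is tiny; it stands in for, and does not reproduce, the exploration heuristics the abstract mentions.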