Software Trace Cache for Commercial Applications

Authors:
Alex Ramirez;Josep Ll. Larriba-Pey;Carlos Navarro;Mateo Valero;Josep Torrellas
Affiliations:
Universidad Politecnica de Catalunya, Jordi Girona 1–3, D6, 08034 Barcelona, Spain;Universidad Politecnica de Catalunya, Jordi Girona 1–3, D6, 08034 Barcelona, Spain;Universidad Politecnica de Catalunya, Jordi Girona 1–3, D6, 08034 Barcelona, Spain;Universidad Politecnica de Catalunya, Jordi Girona 1–3, D6, 08034 Barcelona, Spain;Digital Computer Laboratory, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801
Venue:
International Journal of Parallel Programming
Year:
2002

Abstract

In this paper we address the important problem of instruction fetch for future wide-issue superscalar processors. Our approach focuses on understanding the interaction between software and hardware techniques that aim to increase instruction fetch bandwidth; this is the objective of, for instance, the Hardware Trace Cache (HTC). We design a profile-based code reordering technique that aims to maximize the sequentiality of instructions while still trying to minimize instruction cache misses. We call our software approach the Software Trace Cache (STC). We evaluate the STC on its own, and then compare it with the HTC and with the combination of both techniques. Our results on PostgreSQL show that, for large codes with few loops and deterministic execution sequences, the STC offers better results than an HTC. The software and hardware approaches also combine well to obtain improved results.
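
To make the idea of profile-based reordering for sequentiality more concrete, the following is a minimal sketch of a generic greedy basic-block chaining pass. It is not the STC algorithm itself, whose details the abstract does not give. The function names (`build_layout`, `fall_through_fraction`), the profile format (per-block and per-edge execution counts), and the toy control-flow graph are all assumptions made for illustration. The sketch covers only the sequentiality goal and does not model instruction cache misses.

```python
from collections import defaultdict


def build_layout(block_counts, edge_counts):
    """Greedy chaining (illustrative, not the paper's algorithm).

    Start at the hottest unplaced block and keep appending its most
    frequently taken successor that has not been placed yet.
    """
    successors = defaultdict(list)
    for (src, dst), count in edge_counts.items():
        successors[src].append((count, dst))

    placed = set()
    layout = []
    # Visit seeds from hottest to coldest; ties are broken by block name.
    for seed in sorted(block_counts, key=lambda b: (-block_counts[b], b)):
        current = seed
        while current is not None and current not in placed:
            placed.add(current)
            layout.append(current)
            candidates = [(c, d) for c, d in successors[current] if d not in placed]
            if candidates:
                # Highest count wins; ties are broken by block name.
                current = min(candidates, key=lambda cd: (-cd[0], cd[1]))[1]
            else:
                current = None
    return layout


def fall_through_fraction(layout, edge_counts):
    """Fraction of profiled transitions whose target directly follows
    its source in the layout (a simple proxy for sequentiality)."""
    position = {block: i for i, block in enumerate(layout)}
    total = sum(edge_counts.values())
    sequential = sum(
        count
        for (src, dst), count in edge_counts.items()
        if position[dst] == position[src] + 1
    )
    return sequential / total if total else 0.0


# Hypothetical profile: A branches to B (hot) or C (cold); both reach D.
block_counts = {"A": 100, "B": 90, "C": 10, "D": 100}
edge_counts = {("A", "B"): 90, ("A", "C"): 10, ("B", "D"): 90, ("C", "D"): 10}

source_order = ["A", "C", "B", "D"]  # assumed original code order
reordered = build_layout(block_counts, edge_counts)

print(reordered)
print(fall_through_fraction(source_order, edge_counts))
print(fall_through_fraction(reordered, edge_counts))
```

On this toy profile the hot path A, B, D ends up contiguous, so more of the profiled transitions become fall-throughs than in the assumed source order. A real layout pass would also have to decide where the chains are placed in memory to limit cache conflicts, which this sketch leaves out.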