Memory Bank Predictors

Authors:
Stefan Bieschewski;Joan-Manuel Parcerisa;Antonio Gonzalez
Affiliations:
Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain; Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain; Intel Barcelona Research Center, Intel Labs, Universitat Politècnica de Catalunya, Barcelona, Spain
Venue:
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Year:
2005

Abstract

Cache memories are commonly implemented with multiple memory banks to improve bandwidth and latency. Knowing early which data cache bank an instruction will access can improve performance in several ways. One scenario that is likely to become increasingly important is the clustered microprocessor with a distributed cache. This work presents a study of different cache bank predictors. We show that effective bank predictors can be implemented at relatively low cost. For instance, a predictor of approximately 4 Kbytes achieves an average hit rate of 78% for SPECint2000 when predicting accesses to an 8-bank cache memory in a contemporary superscalar processor. We also show how a predictor can be used to reduce the communication latency caused by memory accesses in a clustered microarchitecture with a distributed cache design.
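
To make the idea of a bank predictor concrete, the sketch below models one very simple scheme: a table indexed by the instruction address (PC) that predicts a memory instruction will access the same bank it accessed last time. This is an illustrative assumption only; the abstract does not describe the predictors the paper studies, and the class name, table size, line size, and line-interleaved bank mapping are all hypothetical choices. The hit rate is measured the same way the abstract reports it, as the fraction of accesses whose bank was predicted correctly.

```python
class LastBankPredictor:
    """Hypothetical PC-indexed 'last bank' predictor (not the paper's design)."""

    def __init__(self, entries=1024, banks=8, line_bytes=32):
        # All three parameters are illustrative assumptions.
        self.entries = entries
        self.banks = banks
        self.line_bytes = line_bytes
        # One small entry per slot; with 8 banks an entry needs 3 bits.
        self.table = [0] * entries

    def bank_of(self, address):
        # Assumed mapping: consecutive cache lines go to consecutive banks.
        return (address // self.line_bytes) % self.banks

    def _index(self, pc):
        # Drop the two low PC bits (assumes 4-byte instructions).
        return (pc >> 2) % self.entries

    def predict(self, pc):
        # Predict the bank recorded at this instruction's table slot.
        return self.table[self._index(pc)]

    def update(self, pc, address):
        # Record the bank that was actually accessed.
        self.table[self._index(pc)] = self.bank_of(address)


def hit_rate(predictor, trace):
    """Fraction of (pc, address) accesses whose bank was predicted correctly."""
    hits = 0
    for pc, address in trace:
        if predictor.predict(pc) == predictor.bank_of(address):
            hits += 1
        predictor.update(pc, address)
    return hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    # A load that keeps touching the same address: one cold miss, then hits.
    same_address = [(0x400, 0x1020)] * 4
    print(hit_rate(LastBankPredictor(), same_address))
```

A scheme this simple fails whenever an instruction walks through memory with a stride of one cache line, because the bank changes on every access; more elaborate predictors would be needed to capture such patterns.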