Low load latency through sum-addressed memory (SAM)

Authors:
William L. Lynch;Gary Lauterbach;Joseph I. Chamdani
Affiliations:
Sun Microsystems, 901 San Antonio Road, MS USUN02-203, Palo Alto CA;Sun Microsystems, 901 San Antonio Road, MS USUN02-203, Palo Alto CA;Sun Microsystems, 901 San Antonio Road, MS USUN02-203, Palo Alto CA
Venue:
Proceedings of the 25th annual international symposium on Computer architecture
Year:
1998

Citing 6

Performance optimization of pipelined primary cache
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture

Evaluation of A+B=K Conditions Without Carry Propagation
IEEE Transactions on Computers

The SPARC architecture manual (version 9)

Shade: a fast instruction-set simulator for execution profiling
SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems

Streamlining data cache access with fast address calculation
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture

Cache Memories
ACM Computing Surveys (CSUR)

Cited 5

A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture

Designing domain-specific processors
Proceedings of the ninth international symposium on Hardware/software codesign

Direct load: dependence-linked dataflow resolution of load address and cache coordinate
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture

Address-free memory access based on program syntax correlation of loads and stores
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on the 2001 international conference on computer design (ICCD)

Reducing non-deterministic loads in low-power caches via early cache set resolution
Microprocessors & Microsystems

Abstract

Load latency contributes significantly to execution time. Because most cache accesses hit, cache-hit latency is an important component of expected load latency. Most modern microprocessors have base+offset addressing loads; thus effective cache-hit latency includes an addition as well as the RAM access. This paper introduces a new technique used in the UltraSPARC III microprocessor, Sum-Addressed Memory (SAM), which performs true addition using the decoder of the RAM array, with very low latency. We compare SAM with other methods for reducing the add part of load latency. These methods include sum-prediction with recovery and bitwise indexing with duplicate-tolerance. The results demonstrate the superior performance of SAM.
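
The idea of adding inside the decoder can be illustrated in software. The following is a minimal sketch, not the circuit from the paper: it assumes each decoder row K can test base + offset = K with a carry-free equality check, so no carry-propagating adder sits in front of the RAM. The function names `sum_equals` and `sam_select_row` and the `width` parameter are hypothetical, chosen only for this illustration.

```python
def sum_equals(a: int, b: int, k: int, width: int) -> bool:
    """Test (a + b) mod 2**width == k without propagating a carry chain.

    Each bit position is checked independently:
    - the carry-in that bit i needs so that its sum bit equals k_i,
    - the carry-out that bit i-1 produces if its sum bit equals k_(i-1).
    The sum matches k exactly when these agree at every position.
    """
    mask = (1 << width) - 1
    a, b, k = a & mask, b & mask, k & mask
    required_carry_in = a ^ b ^ k
    # Carry-out per bit, assuming that bit's sum equals the matching bit of k.
    carry_out = (a & b) | ((a | b) & ~k)
    # Bit i receives the carry-out of bit i-1; bit 0 receives no carry.
    implied_carry_in = (carry_out << 1) & mask
    return required_carry_in == implied_carry_in


def sam_select_row(base: int, offset: int, index_bits: int) -> list[int]:
    """Model a decoder in which every row k evaluates its own equality test.

    Returns the list of rows whose word line would fire (assumed model).
    """
    return [k for k in range(1 << index_bits)
            if sum_equals(base, offset, k, index_bits)]


if __name__ == "__main__":
    # Exactly one row should match, and it should be the true sum.
    for base in range(16):
        for offset in range(16):
            assert sam_select_row(base, offset, 4) == [(base + offset) % 16]
```

In hardware the per-row tests would run in parallel and each needs only shallow logic on neighboring bit pairs, which is why this style of decode can be faster than an add followed by a conventional decode. The loop over rows here is purely a software stand-in for that parallelism.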