On reducing load/store latencies of cache accesses

  • Authors:
  • Yuan-Shin Hwang; Jia-Jhe Li

  • Affiliations:
  • Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan; Department of Computer Science, National Tsing Hua University, Hsinchu 300, Taiwan

  • Venue:
  • Journal of Systems Architecture: the EUROMICRO Journal
  • Year:
  • 2010

Abstract

Effective address calculations for load and store instructions must compete with other instructions for the ALU, so data cache accesses can incur extra latency. Fast address generation has been proposed as a way to reduce cache access latency. This paper presents a fast address generator that eliminates most effective address computations by storing the computed effective addresses of previous load/store instructions in a dummy register file. Experimental results show that, on the SPECint2000 benchmarks, this fast address generator reduces the effective address computations of load and store instructions by about 74% on average and cuts execution time by 8.5%. Furthermore, when multiple dummy register files are deployed, it eliminates over 90% of these effective address computations and improves average execution time by 9.3%.
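
The abstract does not describe how the dummy register file is organized, so the following is only a minimal sketch of the general idea of reusing previously computed effective addresses. It assumes one entry per base register, a hit only when the offset matches the stored one, and invalidation whenever the base register is written. The names `DummyRegisterFile`, `effective_address`, `invalidate`, and `run_trace` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class DummyRegisterFile:
    """Hypothetical store of effective addresses, one entry per base register.

    Assumption: an entry is reusable only if the same base register is used
    with the same offset and the base register has not been written since.
    """
    entries: dict = field(default_factory=dict)  # base_reg -> (offset, address)
    alu_computations: int = 0
    reuses: int = 0

    def effective_address(self, regs, base_reg, offset):
        cached = self.entries.get(base_reg)
        if cached is not None and cached[0] == offset:
            # Hit: reuse the stored address, no ALU operation needed.
            self.reuses += 1
            return cached[1]
        # Miss: compute base + offset on the ALU and remember the result.
        address = regs[base_reg] + offset
        self.alu_computations += 1
        self.entries[base_reg] = (offset, address)
        return address

    def invalidate(self, base_reg):
        # Must be called whenever base_reg is written (assumed policy).
        self.entries.pop(base_reg, None)


def run_trace():
    regs = {"r1": 0x1000, "r2": 0x2000}
    drf = DummyRegisterFile()
    addrs = [
        drf.effective_address(regs, "r1", 8),  # miss: computed on the ALU
        drf.effective_address(regs, "r1", 8),  # hit: reused
        drf.effective_address(regs, "r2", 4),  # miss: different base register
    ]
    regs["r1"] += 16          # base register updated
    drf.invalidate("r1")      # stale entry dropped
    addrs.append(drf.effective_address(regs, "r1", 8))  # miss again
    return addrs, drf.alu_computations, drf.reuses
```

In this toy trace, one of the four address generations is served from the dummy register file. The reduction figures reported in the abstract come from the authors' own design and benchmarks, not from a model like this one.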