The Latency Hiding Effectiveness of Decoupled Access/Execute Processors

  • Authors:
  • Joan-Manuel Parcerisa;Antonio González

  • Venue:
  • EUROMICRO '98 Proceedings of the 24th Conference on EUROMICRO - Volume 1
  • Year:
  • 1998

Abstract

Several studies have demonstrated that out-of-order execution processors may not be the most suitable organization for wide-issue processors, due to the increasing penalties that wire delays will cause in the issue logic. The main purpose of out-of-order execution is to hide functional unit latencies and memory latency. However, the former can be handled quite effectively at compile time, and this observation is one of the main arguments for the emerging EPIC architectures. In this paper, we demonstrate that a decoupled access/execute organization is very effective at hiding memory latency, even when it is very long. This paper presents a thorough evaluation of such a processor organization. First, a generic decoupled access/execute architecture is defined and evaluated. Then the benefits of a lockup-free cache, control speculation and a store-load bypass mechanism under such an architecture are evaluated. Our analysis indicates that memory latency can be almost completely hidden by such techniques.
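The latency-hiding idea in the abstract can be illustrated with a toy cycle-count model. This is only a sketch under strong assumptions and is not the simulator or methodology used in the paper: every load misses with a fixed latency, the access unit issues at most one load per cycle, the execute unit consumes one loaded value per cycle in order, and the cache is lockup-free so several loads may be outstanding at once. The function names and the `queue_depth` parameter are hypothetical, introduced here for illustration.

```python
def coupled_cycles(n_loads, mem_latency):
    """Assumed baseline: in-order pipeline that blocks on every load."""
    cycles = 0
    for _ in range(n_loads):
        cycles += mem_latency  # stall until the loaded value returns
        cycles += 1            # execute the operation that uses it
    return cycles


def decoupled_cycles(n_loads, mem_latency, queue_depth):
    """Assumed decoupled model: the access unit runs ahead of the
    execute unit, limited only by a load queue of `queue_depth` entries."""
    consumed_at = []   # cycle at which the execute unit finishes load i
    prev_issue = -1    # cycle at which the previous load was issued
    ep_free = 0        # first cycle at which the execute unit is idle
    for i in range(n_loads):
        issue = prev_issue + 1  # at most one load issued per cycle
        if i >= queue_depth:
            # A queue slot frees only once an older load has been consumed.
            issue = max(issue, consumed_at[i - queue_depth])
        arrive = issue + mem_latency
        start = max(ep_free, arrive)  # wait for both data and execute unit
        ep_free = start + 1
        consumed_at.append(ep_free)
        prev_issue = issue
    return ep_free


if __name__ == "__main__":
    print(coupled_cycles(100, 50))        # 5100
    print(decoupled_cycles(100, 50, 64))  # 150
    print(decoupled_cycles(100, 50, 1))   # 5100
```

In this model, a queue deeper than the memory latency lets the latency be paid only once, at start-up, while a queue depth of 1 degenerates to the blocking baseline. Real workloads add effects this sketch ignores, such as cache hits, branches, and dependences that force the access unit to wait for the execute unit.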