A distributed processor state management architecture for large-window processors

  • Authors:
  • Isidro Gonzalez;Marco Galluzzi;Alex Veidenbaum;Marco A. Ramirez;Adrian Cristal;Mateo Valero

  • Affiliations:
  • Department of Computer Architecture, UPC, Barcelona, Spain;Department of Computer Architecture, UPC, Barcelona, Spain;Department of Computer Science, UC Irvine, USA;Center for Computing Research, IPN, México City, México;Department of Computer Architecture, BSC - CNS, Barcelona, Spain;Department of Computer Architecture, UPC, Barcelona, Spain

  • Venue:
  • Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Processor architectures with large instruction windows have been proposed to expose more instruction-level parallelism (ILP) and increase performance. Some of the proposed architectures replace a re-order buffer (ROB) with a check-pointing mechanism and an out-of-order release of processor resources. Check-pointing, however, leads to an imprecise processor state recovery on mis-predicted branches and exceptions and re-execution of correct-path instructions after state recovery. It also requires large register files complicating renaming, allocation and release of physical registers. This paper proposes a new processor architecture called a Multi-State Processor (MSP). The MSP does not use check-pointing, avoids the above-mentioned problems, and has a fast, distributed state recovery mechanism. The MSP uses a novel register management architecture allowing implementation of large register files with simpler and more scalable register allocation, renaming, and release. It is also key to precise processor state recovery mechanism. The MSP is shown to improve IPC by 14%, on average, for integer SPEC CPU2000 benchmarks compared to a check-pointing based mechanism ([2]) when a fast and simple branch predictor is used. With a very aggressive branch predictor the IPC improvement is 1%, on average, and 3% if some of the programs are optimized for the MSP. The MSP also reduces the average number of executed instructions by 16.5% (12% for the aggressive branch predictor), mostly due to precise state recovery. This improves the MSP processor energy efficiency even though it uses a larger register file.