Decoupled state-execute architecture

  • Authors:
  • Miquel Pericàs;Adrián Cristal;Ruben González;Alex Veidenbaum;Mateo Valero

  • Affiliations:
  • Computer Architecture Department, Technical University of Catalonia, Barcelona, Spain and Barcelona Supercomputing Center, Barcelona, Spain;Barcelona Supercomputing Center, Barcelona, Spain;Computer Architecture Department, Technical University of Catalonia, Barcelona, Spain;Department of Computer Science, University of California, Irvine, CA;Computer Architecture Department, Technical University of Catalonia, Barcelona, Spain and Barcelona Supercomputing Center, Barcelona, Spain

  • Venue:
  • ISHPC'05/ALPS'06 Proceedings of the 6th international symposium on high-performance computing and 1st international conference on Advanced low power systems
  • Year:
  • 2005

Abstract

The majority of register file designs follow one of two well-known approaches. Many modern high-performance processors (POWER4 [1], Pentium4 [2]) use a merged register file that holds both architectural and rename registers. Other processors use a Future File (e.g., Opteron [3]) with rename registers kept separately in reservation stations. Both approaches have issues that may limit their application in future microprocessors. The merged register file scales poorly in terms of power-performance, while the Future File pays a large penalty on branch misprediction recovery. In addition, the Future File requires the use of the less scalable mechanism of reservation stations. This paper proposes to combine the best aspects of the traditional Future File architecture with those of the merged physical register file. The key point is that the new architecture separates the processor state, in particular the registers, from the execution units in the pipeline back-end. Therefore it is called the Decoupled State-Execute Architecture. The resulting register file can be accessed in the pipeline front-end and has several desirable properties that allow efficient application of several optimizations, most notably register file banking and a novel writeback filtering mechanism. As a result, only a 1.0% IPC degradation was observed with aggressive banking, and the energy consumption was lowered by the new writeback filtering technique. Together, the two optimizations remove approximately 80% of the energy consumed in the register file data array.