Analysis of execution efficiency in the microthreaded processor UTLEON3

  • Authors:
  • Jaroslav Sykora; Leos Kafka; Martin Danek; Lukas Kohout

  • Affiliations:
  • Institute of Information Theory and Automation of the ASCR, Department of Signal Processing, Prague, Czech Republic (all authors)

  • Venue:
  • ARCS'11 Proceedings of the 24th international conference on Architecture of computing systems
  • Year:
  • 2011

Abstract

We analyse the impact of long-latency instructions, the family blocksize parameter, and the thread switch modifier on the execution efficiency of families of threads in a single-core configuration of the UTLEON3 processor, which implements the SVP microthreading model. The analysis is supported by code execution on an FPGA implementation of the processor. By classifying long-latency operations as either pipelined (e.g. floating-point operations) or non-pipelined (e.g. cache faults), we show that the blocksize parameter, which controls resource utilization in the microthreaded processor, has a profound effect when the latency is pipelined: increasing the blocksize can improve performance. In the non-pipelined case, the efficiency reaches its maximum even at a small blocksize, beyond which it cannot improve because an exclusive resource is occupied (memory bus congestion). The conclusions drawn in this paper can be used to optimize code compilation for the microthreaded processor. Because the compiler specifies the blocksize parameter for each family of threads individually, it can optimize the register file utilization of the processor.
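
The qualitative contrast between the two latency classes can be illustrated with a toy throughput bound. The sketch below is not the paper's analysis and uses no measured data: the function names, the `work` and `latency` parameters, and the model itself are assumptions made for illustration. It assumes each thread alternates `work` cycles of useful instructions with one long-latency operation of `latency` cycles, that `blocksize` threads are interleaved on a single-issue pipeline, and that a non-pipelined operation holds an exclusive resource for its whole latency.

```python
def _check(blocksize, work, latency):
    """Validate the (hypothetical) model parameters."""
    if blocksize < 1 or work <= 0 or latency < 0:
        raise ValueError("need blocksize >= 1, work > 0, latency >= 0")


def interleaving_bound(blocksize, work, latency):
    """Fraction of cycles filled when `blocksize` threads hide each other's waits."""
    return blocksize * work / (work + latency)


def efficiency_pipelined(blocksize, work, latency):
    """Toy bound for pipelined latency: operations from different threads overlap."""
    _check(blocksize, work, latency)
    return min(1.0, interleaving_bound(blocksize, work, latency))


def efficiency_non_pipelined(blocksize, work, latency):
    """Toy bound for non-pipelined latency: one operation at a time holds the resource."""
    _check(blocksize, work, latency)
    # The exclusive resource serves at most one operation per `latency` cycles,
    # and each operation is accompanied by `work` cycles of useful instructions.
    resource_bound = work / latency if latency > 0 else 1.0
    return min(1.0, interleaving_bound(blocksize, work, latency), resource_bound)


if __name__ == "__main__":
    # Illustrative values only: 4 cycles of work per 12-cycle long-latency operation.
    for b in (1, 2, 4, 8):
        print(b, efficiency_pipelined(b, 4, 12), efficiency_non_pipelined(b, 4, 12))
```

Under these assumptions the pipelined bound keeps growing with the blocksize until the pipeline is fully occupied, while the non-pipelined bound stops improving as soon as the exclusive resource is saturated, which mirrors the behaviour described in the abstract.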