Improving the Precise Interrupt Mechanism of Software-Managed TLB Miss Handlers

  • Authors:
  • Aamer Jaleel;Bruce L. Jacob

  • Affiliations:
  • -;-

  • Venue:
  • HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

The effects of the general-purpose precise interrupt mechanisms in use for the past few decades have received very little attention. When modern out-of-order processors handle interrupts precisely, they typically begin by flushing the pipeline to make the CPU available to execute handler instructions. In doing so, the CPU ends up flushing many instructions that have been brought in to the reorder buffer. In particular, many of these instructions have reached a very deep stage in the pipeline - representing significant work that is wasted. In addition, an overhead of several cycles can be expected in re-fetching and re-executing these instructions. This paper concentrates on improving the performance of precisely handling software managed translation lookaside buffer (TLB) interrupts, one of the most frequently occurring interrupts. This paper presents a novel method of in-lining the interrupt handler within the reorder buffer. Since the first level interrupt-handlers of TLBs are usually small, they could potentially fit in the reorder buffer along with the user-level code already there. In doing so, the instructions that would otherwise be flushed from the pipe need not be re-fetched and re-executed. Additionally, it allows for instructions independent of the exceptional instruction to continue to execute in parallel with the handler code. We simulate two different schemes of in-lining the interrupt on a processor with a 4-way out-of-order core similar to the Alpha 21264. We also analyzed the overhead of re-fetching and re-executing instructions when handling an interrupt by the traditional method. We find that our schemes significantly cut back on the number of instructions being re-fetched by 50-90%, and also provides a performance improvement of 5-25%.