Improving the Precise Interrupt Mechanism of Software-Managed TLB Miss Handlers

Authors:
Aamer Jaleel;Bruce L. Jacob
Venue:
HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
Year:
2001


Abstract

The effects of the general-purpose precise interrupt mechanisms in use for the past few decades have received very little attention. When modern out-of-order processors handle interrupts precisely, they typically begin by flushing the pipeline to make the CPU available to execute handler instructions. In doing so, the CPU flushes many instructions that have already been brought into the reorder buffer. Many of these instructions have reached a very deep stage in the pipeline, so flushing them wastes significant work. In addition, re-fetching and re-executing these instructions incurs an overhead of several cycles. This paper concentrates on improving the performance of precisely handling software-managed translation lookaside buffer (TLB) interrupts, one of the most frequently occurring interrupts. It presents a novel method of in-lining the interrupt handler within the reorder buffer. Since the first-level interrupt handlers of TLBs are usually small, they could potentially fit in the reorder buffer along with the user-level code already there. When they do, the instructions that would otherwise be flushed from the pipe need not be re-fetched and re-executed. Additionally, in-lining allows instructions independent of the exceptional instruction to continue to execute in parallel with the handler code. We simulate two different schemes of in-lining the interrupt on a processor with a 4-way out-of-order core similar to the Alpha 21264. We also analyze the overhead of re-fetching and re-executing instructions when handling an interrupt by the traditional method. We find that our schemes reduce the number of instructions being re-fetched by 50-90% and provide a performance improvement of 5-25%.
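
The decision described in the abstract, in-lining the handler when it fits in the reorder buffer and flushing otherwise, can be illustrated with a toy model. This is a minimal sketch under stated assumptions and not the simulator or either of the two schemes evaluated in the paper: the names `ReorderBuffer` and `handle_tlb_miss`, the slot-counting rule, and the fallback of flushing the excepting instruction plus everything younger are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class ReorderBuffer:
    """Toy reorder buffer that only tracks entry counts (hypothetical model)."""
    capacity: int  # total number of entries
    occupied: int  # entries holding in-flight user instructions

    def free_slots(self) -> int:
        return self.capacity - self.occupied


def handle_tlb_miss(rob: ReorderBuffer, handler_len: int, younger_in_flight: int) -> dict:
    """Pick a handling path for a TLB miss and report the re-fetch cost.

    handler_len:       instructions in the first-level miss handler
    younger_in_flight: in-flight instructions younger than the excepting one
    Assumption: the handler is in-lined only if it fits in the free entries.
    """
    if younger_in_flight + 1 > rob.occupied:
        raise ValueError("more instructions in flight than the buffer holds")

    if handler_len <= rob.free_slots():
        # In-line path: handler shares the buffer with user code, nothing is flushed.
        rob.occupied += handler_len
        return {"mode": "inline", "refetched": 0}

    # Traditional path: flush the excepting instruction and everything younger,
    # all of which must be re-fetched after the handler completes.
    flushed = younger_in_flight + 1
    rob.occupied -= flushed
    return {"mode": "flush", "refetched": flushed}


if __name__ == "__main__":
    print(handle_tlb_miss(ReorderBuffer(capacity=64, occupied=40), 10, 30))
    print(handle_tlb_miss(ReorderBuffer(capacity=64, occupied=60), 10, 30))
```

In the first call the 10-instruction handler fits in the 24 free entries, so no user instructions are re-fetched; in the second only 4 entries are free, so the model falls back to a flush and re-fetches 31 instructions. The buffer size and handler length here are illustrative values, not figures from the paper.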