Instruction issue logic for high-performance, interruptable pipelined processors
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
The interaction of architecture and operating system design
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
MIPS RISC architectures
Design tradeoffs for software-managed TLBs
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Architectural support for translation table management in large address space machines
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The impact of architectural trends on operating system performance
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Reducing TLB power requirements
ISLPED '97 Proceedings of the 1997 international symposium on Low power electronics and design
Power considerations in the design of the Alpha 21264 microprocessor
DAC '98 Proceedings of the 35th annual Design Automation Conference
A look at several memory management units, TLB-refill mechanisms, and page table organizations
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Tolerating late memory traps in ILP processors
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Concurrent Event Handling through Multithreading
IEEE Transactions on Computers
The use of multithreading for exception handling
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Implementation of precise interrupts in pipelined processors
ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
Performance analysis of the Alpha 21264-based Compaq ES40 system
Proceedings of the 27th annual international symposium on Computer architecture
Circuits for wide-window superscalar processors
Proceedings of the 27th annual international symposium on Computer architecture
IEEE Micro
The MIPS R10000 Superscalar Microprocessor
IEEE Micro
Interrupt Handling for Out-of-Order Execution Processors
IEEE Transactions on Computers
Improving the Precise Interrupt Mechanism of Software-Managed TLB Miss Handlers
HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
In-Line Interrupt Handling for Software-Managed TLBs
ICCD '01 Proceedings of the International Conference on Computer Design: VLSI in Computers & Processors
An EDF Interrupt Handling Scheme for Real-Time Kernel: Design and Task Simulation
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part IV: ICCS 2007
A lazy EDF interrupt scheduling algorithm for multiprocessor in parallel computing environment
ICA3PP'07 Proceedings of the 7th international conference on Algorithms and architectures for parallel processing
Hi-index | 14.98 |
The effects of the general-purpose precise interrupt mechanisms in use for the past few decades have received very little attention. When modern out-of-order processors handle interrupts precisely, they typically begin by flushing the pipeline to make the CPU available to execute handler instructions. In doing so, the CPU ends up flushing many instructions that have been brought in to the reorder buffer. In particular, these instructions may have reached a very deep stage in the pipeline—representing significant work that is wasted. In addition, an overhead of several cycles and wastage of energy (per exception detected) can be expected in refetching and reexecuting the instructions flushed. This paper concentrates on improving the performance of precisely handling software managed translation look-aside buffer (TLB) interrupts, one of the most frequently occurring interrupts. The paper presents a novel method of in-lining the interrupt handler within the reorder buffer. Since the first level interrupt-handlers of TLBs are usually small, they could potentially fit in the reorder buffer along with the user-level code already there. In doing so, the instructions that would otherwise be flushed from the pipe need not be refetched and reexecuted. Additionally, it allows for instructions independent of the exceptional instruction to continue to execute in parallel with the handler code. By in-lining the TLB interrupt handler, this provides lock-up free TLBs. This paper proposes the prepend and append schemes of in-lining the interrupt handler into the available reorder buffer space. The two schemes are implemented on a performance model of the Alpha 21264 processor built by Alpha designers at the Palo Alto Design Center (PADC), California. We compare the overhead and performance impact of handling TLB interrupts by the traditional scheme, the append in-lined scheme, and the prepend in-lined scheme. For small, medium, and large memory footprints, the overhead is quantified by comparing the number and pipeline state of instructions flushed, the energy savings, and the performance improvements. We find that lock-up free TLBs reduce the overhead of refetching and reexecuting the instructions flushed by 30-95 percent, reduce the execution time by 5-25 percent, and also reduce the energy wasted by 30-90 percent.