Reducing misspeculation penalty in trace-level speculative multithreaded architectures

Authors:
Carlos Molina;Jordi Tubella;Antonio González
Affiliations:
Dept. Eng. Informàtica i Matemàtiques, Universitat Rovira i Virgili, Tarragona, Spain;Dept. d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain;Dept. d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain and Intel Barcelona Research Center, Intel Labs-UPC, Barcelona, Spain
Venue:
ISHPC'05/ALPS'06 Proceedings of the 6th international symposium on high-performance computing and 1st international conference on Advanced low power systems
Year:
2005

Abstract

Trace-Level Speculative Multithreaded Processors exploit trace-level speculation by means of two threads working cooperatively. One thread, called the speculative thread, executes instructions ahead of the other by speculating on the result of several traces. The other thread executes the speculated traces and verifies the speculation made by the first thread. Speculated traces are validated by verifying their live-output values. Every time a trace misspeculation is detected, a thread synchronization is fired. This recovery action involves flushing the pipeline and reverting to a safe point in the program, which incurs a performance penalty. This paper proposes a new thread synchronization scheme based on the observation that a significant number of instructions are control- and data-independent of the mispredicted instruction. This scheme significantly increases the performance potential of the architecture at lower cost. Our experimental results show that the mechanism cuts the number of executed instructions by 8% and achieves an average speed-up of almost 9% for a collection of SPEC2000 benchmarks.
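
As a rough illustration of the idea in the abstract, the following Python sketch models live-output validation and then estimates which instructions after a misspeculated trace would actually need to be re-executed. It is a toy software model built on assumptions, not the hardware mechanism the paper proposes: the `Instr` record, the register names, and both function names are hypothetical, and only data dependences through registers are tracked (control dependences and memory are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: one destination, several sources."""
    dest: str
    srcs: tuple


def mismatched_live_outputs(predicted, actual):
    """Return the live-output registers whose speculated value was wrong."""
    return {reg for reg, value in actual.items() if predicted.get(reg) != value}


def instructions_to_reexecute(window, wrong_regs):
    """Indices of instructions that consumed a wrong value, directly or
    transitively. Assumption: straight-line code, register dependences only."""
    tainted = set(wrong_regs)
    redo = []
    for index, ins in enumerate(window):
        if tainted & set(ins.srcs):
            redo.append(index)
            tainted.add(ins.dest)      # its result is wrong as well
        else:
            tainted.discard(ins.dest)  # overwritten with a correct value
    return redo


# Speculated versus verified live-output values of one trace (made-up data).
predicted = {"r1": 10, "r2": 7}
actual = {"r1": 10, "r2": 8}

# Instructions the speculative thread ran after the trace (made-up data).
window = [
    Instr("r3", ("r1",)),
    Instr("r4", ("r2", "r3")),
    Instr("r2", ("r1",)),
    Instr("r5", ("r2",)),
    Instr("r6", ("r4",)),
]

wrong = mismatched_live_outputs(predicted, actual)
redo = instructions_to_reexecute(window, wrong)
print(f"full flush re-executes {len(window)} instructions, "
      f"selective recovery re-executes {len(redo)}")
```

In this made-up window a full flush would discard all five instructions, whereas only those that depend on the wrong live-output value would have to be repeated. That gap is the kind of saving the abstract attributes to instructions that are independent of the misprediction.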