Distributed replay protocol for distributed uniprocessors

  • Authors:
  • Mengjie Mao; Hong An; Bobin Deng; Tao Sun; Xuechao Wei; Wei Zhou; Wenting Han

  • Affiliations:
  • University of Science and Technology of China, Hefei, China (all authors)

  • Venue:
  • Proceedings of the 26th ACM international conference on Supercomputing
  • Year:
  • 2012

Abstract

Data speculation has been heavily exploited across many areas of architecture design. It bridges the gap in time or space between a data producer and its consumer, giving processors the opportunity to gain significant speedups. However, large instruction windows, deep pipelines, and the increasing latency of on-chip communication make data misspeculation very expensive in modern processors. This paper proposes a Distributed Replay Protocol (DRP) that addresses data misspeculation in a distributed uniprocessor named TFlex. The partitioned nature of distributed uniprocessors aggravates the penalty of data misspeculation. After detecting a misspeculation, DRP avoids squashing the pipeline; instead, it retains all instructions in the window and selectively replays only those that depend on the misspeculated data. As one possible use of DRP, we apply it to recovery from data dependence speculation. We also summarize the challenges of implementing a selective replay mechanism on distributed uniprocessors and propose two variations of DRP that address them. The evaluation results show that, without data speculation, DRP achieves 99% of the performance of perfect memory disambiguation. It speeds up diverse applications over the baseline TFlex (with a state-of-the-art data dependence predictor) by a geometric mean of 24%.
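To make the contrast between squashing and selective replay concrete, the following is a minimal, centralized toy sketch. It is not the DRP protocol itself, which operates across distributed cores and is not specified in this abstract. The names `selective_replay`, `squash`, `consumers`, and `window` are hypothetical, and instructions are assumed to be identified by their position in program order.

```python
from collections import deque

def selective_replay(consumers, misspeculated):
    """Return the misspeculated instruction plus everything that
    transitively consumes its result (assumed dependence map)."""
    seen = {misspeculated}
    queue = deque([misspeculated])
    while queue:
        inst = queue.popleft()
        for consumer in consumers.get(inst, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return sorted(seen)

def squash(window, misspeculated):
    """Conventional recovery: discard and re-execute every instruction
    from the misspeculated one onward, dependent or not."""
    return [inst for inst in window if inst >= misspeculated]

# Hypothetical 8-instruction window; instruction 2 is a misspeculated load.
window = list(range(8))
consumers = {2: [4], 4: [6], 3: [5]}  # producer -> direct consumers

replayed = selective_replay(consumers, 2)
squashed = squash(window, 2)
print(replayed)  # [2, 4, 6]
print(squashed)  # [2, 3, 4, 5, 6, 7]
```

In this toy window, selective replay re-executes three instructions where a squash would re-execute six. Instructions 3, 5, and 7 do not depend on the misspeculated load and keep their results.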