Wire Delay is Not a Problem for SMT (In the Near Future)

  • Authors:
  • T. N. Vijaykumar; Zeshan Chishti

  • Affiliations:
  • Purdue University; Purdue University

  • Venue:
  • Proceedings of the 31st annual international symposium on Computer architecture
  • Year:
  • 2004

Abstract

Previous papers have shown that the slow scaling of wire delays compared to logic delays will prevent superscalar performance from scaling with technology. In this paper we show that the optimal pipeline for superscalar becomes shallower with technology, when wire delays are considered, tightening previous results that deeper pipelines perform only as well as shallower pipelines. The key reason for the lack of performance scaling is that superscalar does not have sufficient parallelism to hide the relatively-increased wire delays. However, Simultaneous Multithreading (SMT) provides the much-needed parallelism. We show that an SMT running a multiprogrammed workload with just 4-way issue not only retains the optimal pipeline depth over technology generations, enabling at least 43% increase in clock speed every generation, but also achieves the remainder of the expected speedup of two per generation through IPC. As wire delays become more dominant in future technologies, the number of programs needs to be scaled modestly to maintain the scaling trends, at least till the near-future 50nm technology. While this result ignores bandwidth constraints, using SMT to tolerate latency due to wire delays is not that simple because SMT causes bandwidth problems. Most of the stages of a modern out-of-order-issue pipeline employ RAM and CAM structures. Wire delays in conventional, latency-optimized RAM/CAM structures prevent them from being pipelined in a scaled manner. We show that this limitation prevents scaling of SMT throughput. We use bitline scaling to allow RAM/CAM bandwidth to scale with technology. Bitline scaling enables SMT throughput to scale at the rate of two per technology generation in the near future.
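
The throughput arithmetic in the abstract can be checked with a short sketch. It assumes that per-generation speedup factors as clock-speed gain times IPC gain, which the abstract implies but does not state outright. The function and constant names are illustrative and do not come from the paper.

```python
def required_ipc_gain(target_speedup: float, clock_gain: float) -> float:
    """IPC gain needed per generation.

    Assumption (not stated explicitly in the abstract):
    speedup = clock_gain * ipc_gain.
    """
    if clock_gain <= 0:
        raise ValueError("clock_gain must be positive")
    return target_speedup / clock_gain


# Figures quoted in the abstract.
TARGET_SPEEDUP = 2.0  # expected speedup of two per technology generation
CLOCK_GAIN = 1.43     # at least 43% increase in clock speed per generation

ipc_gain = required_ipc_gain(TARGET_SPEEDUP, CLOCK_GAIN)
print(f"IPC must grow by about {ipc_gain:.2f}x per generation")
```

Under that assumption, a 43% clock gain leaves roughly a 1.4x IPC improvement per generation to be supplied by the additional threads.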