SMTp: An Architecture for Next-generation Scalable Multi-threading

  • Authors:
  • Mainak Chaudhuri; Mark Heinrich

  • Affiliations:
  • Cornell University, Ithaca, NY; University of Central Florida

  • Venue:
  • Proceedings of the 31st Annual International Symposium on Computer Architecture
  • Year:
  • 2004

Abstract

We introduce the SMTp architecture: an SMT processor augmented with a coherence protocol thread context that, together with a standard integrated memory controller, can enable the design of (among other possibilities) scalable cache-coherent hardware distributed shared memory (DSM) machines from commodity nodes. We describe the minor changes needed to a conventional out-of-order multi-threaded core to realize SMTp, discussing issues related to both deadlock avoidance and performance. We then compare SMTp performance to that of various conventional DSM machines with normal SMT processors, both with and without integrated memory controllers. On configurations from 1 to 32 nodes, with 1 to 4 application threads per node, we find that SMTp delivers performance comparable to, and sometimes better than, machines with more complex integrated DSM-specific memory controllers. Our results also show that the protocol thread has extremely low pipeline overhead. Given the simplicity and the flexibility of the SMTp mechanism, we argue that next-generation multi-threaded processors with integrated memory controllers should adopt this mechanism as a way of building less complex high-performance DSM multiprocessors.