Inferential queueing and speculative push

  • Authors:
  • Ravi Rajwar;Alain Kägi;James R. Goodman

  • Affiliations:
  • Microarchitecture Research Lab, Intel Corporation, Hillsboro, Oregon;Department of Computer Sciences, University of Wisconsin-Madison, Madison, Wisconsin;Computer Science Department, University of Auckland, Auckland, New Zealand

  • Venue:
  • International Journal of Parallel Programming - Special issue I: The 17th annual international conference on supercomputing (ICS'03)
  • Year:
  • 2004

Abstract

Communication latencies within critical sections constitute a major bottleneck in some classes of emerging parallel workloads. In this paper, we argue for the use of two mechanisms to reduce these communication latencies: Inferentially Queued locks (IQLs) and Speculative Push (SP). With IQLs, the processor infers the existence, and limits, of a critical section from the use of synchronization instructions and joins a queue of lock requestors, reducing synchronization delay. The SP mechanism extracts information about program structure by observing IQLs. SP allows the cache controller, responding to a request for a cache line that likely includes a lock variable, to predict the data sets the requestor will modify within the associated critical section. The controller then pushes these lines from its own cache to the target cache, as well as writing them to memory. Overlapping the protected data transfer with that of the lock can substantially reduce the communication latencies within critical sections. By pushing data in exclusive state, the mechanism can collapse a read-modify-write sequence within a critical section into a single local cache access. The write-back to memory allows the receiving cache to ignore the push. Neither mechanism requires any programmer or compiler support nor any instruction set changes. Our experiments demonstrate that IQLs and SP can improve performance of applications employing frequent synchronization.
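The effect of overlapping the protected-data transfer with the lock transfer can be illustrated with a small toy model. The sketch below is not the paper's hardware design: the names `ToySystem`, `critical_section`, and `run`, and the "lines written under this lock last time" predictor are all assumptions made for illustration. It does not model IQL queueing or the write-back to memory; it only counts how many line transfers a processor has to wait for while two processors alternate through the same critical section.

```python
from collections import defaultdict


class ToySystem:
    """Toy model: each cache is just the set of lines it holds exclusively."""

    def __init__(self, n_procs, speculative_push):
        self.caches = [set() for _ in range(n_procs)]
        self.speculative_push = speculative_push
        # Hypothetical predictor: lock line -> lines written under it last time.
        self.predicted = defaultdict(set)
        self.remote_misses = 0

    def _access(self, proc, line):
        """Exclusive access; a miss moves the line from any other cache."""
        if line not in self.caches[proc]:
            self.remote_misses += 1
            for cache in self.caches:
                cache.discard(line)
            self.caches[proc].add(line)

    def critical_section(self, proc, lock_line, data_lines):
        # Find which cache (if any) currently holds the lock line.
        holder = next(
            (i for i, c in enumerate(self.caches) if lock_line in c), None
        )
        self._access(proc, lock_line)
        if self.speculative_push and holder is not None and holder != proc:
            # The previous holder pushes its predicted lines with the lock,
            # so the requestor does not take separate misses for them.
            for line in self.predicted[lock_line]:
                if line in self.caches[holder]:
                    self.caches[holder].discard(line)
                    self.caches[proc].add(line)
        for line in data_lines:
            self._access(proc, line)
        # Remember what was touched, to predict the next handoff.
        self.predicted[lock_line] = set(data_lines)


def run(speculative_push, rounds=4):
    """Two processors alternate through one critical section."""
    system = ToySystem(2, speculative_push)
    for i in range(rounds):
        system.critical_section(proc=i % 2, lock_line=0, data_lines=[1, 2])
    return system.remote_misses


print(run(False), run(True))  # prints "12 6"
```

In this toy setting every handoff without the push costs three waited-for transfers (the lock line plus two data lines), while with the push only the lock transfer is waited for after the first round. Real savings depend on prediction accuracy and on whether the receiving cache accepts the push, neither of which is modeled here.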