The fuzzy barrier: a mechanism for high speed synchronization of processors

  • Authors:
  • Rajiv Gupta

  • Affiliations:
  • Philips Laboratories, North American Philips Corporation, 345 Scarborough Road, Briarcliff Manor, NY

  • Venue:
  • ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
  • Year:
  • 1989

Quantified Score

Hi-index 0.00

Visualization

Abstract

Parallel programs are commonly written using barriers to synchronize parallel processes. Upon reaching a barrier, a processor must stall until all participating processors reach the barrier. A software implementation of the barrier mechanism using shared variables has two major drawbacks. Firstly, the execution of the barrier may be slow as it may not only require execution of several instructions and but also result in hot-spot accesses. Secondly, processors that are stalled waiting for other processors to reach the barrier are essentially idling and cannot do any useful work. In this paper, the notion of the fuzzy barrier is presented, that avoids the above drawbacks. The first problem is avoided by implementing the mechanism in hardware. The second problem is solved by extending the barrier concept to include a region of statements that can be executed by a processor while it awaits synchronization. The barrier regions are constructed by a compiler and consist of several instructions such that a processor is ready to synchronize upon reaching the first instruction in this region and must synchronize before exiting the region. When synchronization does occur, the processors could be executing at any point in their respective barrier regions. The larger the barrier region, the more likely it is that none of the processors will have to stall. Preliminary investigations show that barrier regions can be large and the use of program transformations can significantly increase their size. Examples of situations where such a mechanism can result in improved performance are presented. Results based on a software implementation of the fuzzy barrier on the Encore multiprocessor indicate that the synchronization overhead can be greatly reduced using the mechanism.