Optimizing Synchronization Operations for Remote Memory Communication Systems

  • Authors:
  • Darius Buntinas;Amina Saify;Dhabaleswar K. Panda;Jarek Nieplocha

  • Venue:
  • IPDPS '03 Proceedings of the 17th International Symposium on Parallel and Distributed Processing
  • Year:
  • 2003

Abstract

Synchronization operations, such as fence and locking, are used in many parallel operations that access shared memory. However, a process that is blocked waiting for a fence operation to complete, or for a lock to be acquired, cannot perform useful computation. It is therefore critical that these operations be implemented as efficiently as possible to reduce the time a process waits idle. These operations also affect the scalability of the overall system: as system sizes grow, the number of processes potentially requesting a lock increases. In this paper we describe the design and implementation of an optimized operation that combines a global fence operation with a barrier synchronization operation. We also describe our implementation of an optimized lock algorithm. The optimizations have been incorporated into the ARMCI communication library. The combined global fence and barrier operation improves performance by up to a factor of 9 over the current implementation on a 16-node system, while the optimized lock implementation improves performance by up to a factor of 1.25. These optimizations allow for more efficient and scalable applications.
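
As an illustration of the semantics involved, the following is a minimal toy simulation in Python, not ARMCI code and not the optimized algorithm from the paper. Threads stand in for processes, and every name here (`RemoteMemory`, `Outstanding`, `run_demo`) is hypothetical. It sketches the baseline pattern the abstract refers to: each process issues non-blocking remote writes, performs a global fence (waits until all of its own writes have been applied at their targets), and then enters a barrier. After both steps, every process can safely read data written by every other process.

```python
import queue
import threading


class RemoteMemory:
    """Toy stand-in for one process's remotely writable memory (hypothetical)."""

    def __init__(self, nprocs):
        self.cells = [None] * nprocs
        self.inbox = queue.Queue()
        # A server thread applies incoming writes asynchronously.
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while True:
            src, value, ack = self.inbox.get()
            self.cells[src] = value  # apply the remote write
            ack()                    # then acknowledge it to the origin


class Outstanding:
    """Counts writes that one origin has issued but not yet seen acknowledged."""

    def __init__(self):
        self._n = 0
        self._cv = threading.Condition()

    def issued(self):
        with self._cv:
            self._n += 1

    def acked(self):
        with self._cv:
            self._n -= 1
            self._cv.notify_all()

    def wait_all(self):
        # Fence: block until every issued write has been acknowledged.
        with self._cv:
            self._cv.wait_for(lambda: self._n == 0)


def run_demo(nprocs=4):
    memories = [RemoteMemory(nprocs) for _ in range(nprocs)]
    barrier = threading.Barrier(nprocs)
    results = [None] * nprocs

    def worker(rank):
        pending = Outstanding()
        # Non-blocking "put" of this rank's value into every process's memory.
        for target in memories:
            pending.issued()
            target.inbox.put((rank, rank * 10, pending.acked))
        pending.wait_all()  # global fence: all of my writes are complete
        barrier.wait()      # barrier: everyone else has fenced as well
        results[rank] = list(memories[rank].cells)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(nprocs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for rank, seen in enumerate(run_demo(4)):
        print(rank, seen)
```

In this sketch the fence and the barrier are two separate waiting phases, so each process idles twice. The abstract describes merging the two into a single operation; how that merge is implemented is not stated here, so the sketch deliberately shows only the unmerged baseline.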