The Architectural and Operating System Implications on the Performance of Synchronization on ccNUMA Multiprocessors

  • Authors:
  • Dimitrios S. Nikolopoulos;Theodore S. Papatheodorou

  • Affiliations:
  • Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, 1308 West Main Str., Urbana, Illinois 61801. dsn@csrd.uiuc.edu;Department of Computer Engineering and Informatics, University of Patras, GR26500, Patras, Greece. tsp@hpclab.ceid.upatras.gr

  • Venue:
  • International Journal of Parallel Programming
  • Year:
  • 2001

Abstract

This paper investigates the performance of synchronization algorithms on ccNUMA multiprocessors from the perspectives of the architecture and the operating system. In contrast with previous related studies, which emphasized the relative performance of synchronization algorithms, this paper takes a new approach: it analyzes the sources of synchronization latency on ccNUMA architectures and how this latency can be reduced by leveraging hardware and software schemes in both dedicated and multiprogrammed execution environments. From the architectural perspective, the paper identifies the implications of directory-based cache coherence for the latency and scalability of synchronization instructions, and examines whether and how simple hardware that accelerates these instructions can be leveraged to reduce synchronization latency. From the operating system's perspective, the paper evaluates, in a unified framework, user-level, kernel-level, and hybrid algorithms for implementing scalable synchronization in multiprogrammed execution environments. In addition to addressing these issues, the paper contributes a new methodology for implementing fast synchronization algorithms on ccNUMA multiprocessors. The relevant experiments are conducted on the SGI Origin2000, a popular commercial ccNUMA multiprocessor.
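To make the idea of a hybrid synchronization algorithm concrete, the following is a minimal sketch of a spin-then-yield lock in C11. It is not taken from the paper and is not claimed to match any algorithm the paper evaluates; the names `hybrid_lock_t`, `hybrid_try_lock`, and `SPIN_LIMIT` are hypothetical, and the spin limit of 1000 is an arbitrary placeholder. The sketch assumes a POSIX system for `sched_yield`. Spinning first on a plain load (test-and-test-and-set) keeps waiters reading a locally cached copy of the lock word, which is generally understood to reduce coherence traffic; yielding after a bounded spin gives the processor back to the scheduler, which may help when the lock holder has been preempted in a multiprogrammed environment.

```c
#include <assert.h>
#include <sched.h>      /* sched_yield (POSIX) */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many attempts to make before yielding. */
#define SPIN_LIMIT 1000

/* Hypothetical lock type: a single atomic flag, true while held. */
typedef struct {
    atomic_bool locked;
} hybrid_lock_t;

static void hybrid_lock_init(hybrid_lock_t *l) {
    assert(l != NULL);
    atomic_init(&l->locked, false);
}

/* Test-and-test-and-set: read first, and only issue the atomic
 * exchange when the lock appears free. Returns true on success. */
static bool hybrid_try_lock(hybrid_lock_t *l) {
    if (atomic_load_explicit(&l->locked, memory_order_relaxed))
        return false;
    return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

/* Spin for a bounded number of attempts, then yield the processor
 * and start over. This is the user-level "hybrid" part of the sketch. */
static void hybrid_lock(hybrid_lock_t *l) {
    for (;;) {
        for (int i = 0; i < SPIN_LIMIT; i++) {
            if (hybrid_try_lock(l))
                return;
        }
        sched_yield();
    }
}

static void hybrid_unlock(hybrid_lock_t *l) {
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

A caller would initialize the lock once with `hybrid_lock_init`, then bracket each critical section with `hybrid_lock` and `hybrid_unlock`. A kernel-level variant would block the waiter in the operating system instead of calling `sched_yield`; that is outside the scope of this sketch.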