Stack Evaluation of Arbitrary Set-Associative Multiprocessor Caches
IEEE Transactions on Parallel and Distributed Systems
Shared-memory multiprocessors commonly use shared variables for synchronization. Simulations of real parallel applications show that large-scale cache-coherent multiprocessors suffer significant amounts of invalidation traffic due to synchronization. Large multiprocessors that do not cache synchronization variables are often more severely impacted. If this synchronization traffic is not reduced or managed adequately, synchronization references can cause severe congestion in the network. This thesis reports on data from trace-driven simulations of shared-memory multiprocessors, and proposes a class of adaptive backoff methods that do not use any extra hardware and can significantly reduce the memory traffic to synchronization variables. Our simulations show that when the number of processors participating in a barrier synchronization is small compared to the time of arrival of the processors, reductions of 20 percent to over 96 percent in synchronization traffic can be achieved at no extra cost. In other situations, adaptive backoff techniques result in a tradeoff between reduced network accesses and increased processor idle time.
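The tradeoff between network accesses and idle time can be illustrated with a small model. The sketch below is not the specific adaptive backoff method of this work, which the abstract does not detail; it assumes plain exponential backoff as a stand-in. The function name `count_polls` and its parameters are hypothetical, and one poll of the barrier flag is assumed to cost one network access.

```python
def count_polls(wait_time, base_delay=1, factor=2, backoff=True):
    """Model one processor spinning on a barrier flag.

    wait_time: cycles until the last processor arrives and the flag is set.
    Each poll is assumed to cost one network access (illustrative assumption).
    Returns (polls, overshoot), where overshoot is the number of cycles the
    processor stays idle after the flag was already set.
    """
    t = 0                # current time in cycles
    polls = 0            # network accesses issued so far
    delay = base_delay   # idle cycles before the next poll
    while True:
        polls += 1
        if t >= wait_time:           # flag observed as set: leave the barrier
            return polls, t - wait_time
        t += delay                   # idle locally, no network traffic
        if backoff:
            delay *= factor          # exponential backoff (assumed policy)


if __name__ == "__main__":
    print(count_polls(100, backoff=False))  # polls every cycle
    print(count_polls(100, backoff=True))   # polls at t = 0, 1, 3, 7, ...
```

Under these assumptions, a processor that waits 100 cycles issues 101 polls without backoff and 8 polls with backoff, but with backoff it notices the release 27 cycles late. The extra idle time is the cost side of the tradeoff described above.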