IEEE Transactions on Parallel and Distributed Systems
The efficiency of the basic operations of a NUMA (nonuniform memory access) multiprocessor determines its parallel processing performance. The authors present several analytical models for predicting and evaluating the overhead of interprocessor communication, process scheduling, process synchronization, and remote memory access, taking both network contention and memory contention into account. The models and analyses are supported by performance measurements, presented through several numerical examples, taken on the BBN GP1000, a NUMA shared-memory multiprocessor. Together, the analytical and experimental results give a comprehensive understanding of these effects, which are important for the effective use of NUMA shared-memory multiprocessors. The results can be used to determine optimal strategies for developing an efficient programming environment for a NUMA system.