IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Communication has a dominant impact on the performance of massively parallel processors (MPPs). We propose a methodology for evaluating the internode communication performance of MPPs using a controlled set of synthetic workloads. By generating a range of sparse matrices and measuring the performance of a simple parallel algorithm that repeatedly multiplies a sparse matrix by a dense vector, we can determine the relative performance of different communication workloads. The communication parameters that can be specified include the number of nodes, the average amount of communication per node, the degree of sharing among the nodes, and the computation-communication ratio. We describe a general procedure for constructing sparse matrices that have the desired communication and computation parameters, and we apply a range of these synthetic workloads to evaluate the hierarchical ring interconnection and cache-only memory architecture (COMA) of the Kendall Square Research KSR1 MPP. The analysis discusses the impact of the KSR1 architecture on communication performance, highlighting the utility and impact of the automatic update feature. It also investigates how system contention affects performance, in particular how contention causes potential updates to be ignored.
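
The abstract does not spell out the construction procedure, so the following is only a minimal sketch of how a sparse-matrix workload with a chosen communication volume might be generated and characterized. It assumes a block-row partition in which each node owns `rows_per_node` consecutive rows and the matching slice of the dense vector, so that any nonzero outside a node's own column block stands for a remote read. The function names and the parameters `remote_per_node` and `local_per_row` are hypothetical, and the degree of sharing is only measured here, not controlled.

```python
import random


def build_workload(nodes, rows_per_node, remote_per_node, local_per_row, seed=0):
    """Return the sparsity pattern as a list of column-index sets, one per row.

    Assumption: node k owns rows and vector elements
    [k * rows_per_node, (k + 1) * rows_per_node).
    """
    rng = random.Random(seed)
    n = nodes * rows_per_node
    rows = []
    for node in range(nodes):
        lo, hi = node * rows_per_node, (node + 1) * rows_per_node
        # Local nonzeros contribute computation but no communication.
        for _ in range(rows_per_node):
            rows.append(set(rng.sample(range(lo, hi), local_per_row)))
        # Distinct remote vector elements this node must fetch.
        remote_cols = [c for c in range(n) if not lo <= c < hi]
        needed = rng.sample(remote_cols, remote_per_node)
        # Spread the remote references round-robin over the node's rows.
        for i, c in enumerate(needed):
            rows[lo + i % rows_per_node].add(c)
    return rows


def characterize(rows, nodes, rows_per_node):
    """Measure the communication parameters of a sparsity pattern."""
    readers = {}  # column -> set of non-owning nodes that read it
    nnz = 0
    for r, cols in enumerate(rows):
        node = r // rows_per_node
        for c in cols:
            nnz += 1
            if c // rows_per_node != node:
                readers.setdefault(c, set()).add(node)
    remote_reads = sum(len(s) for s in readers.values())
    return {
        # Average number of distinct remote elements fetched per node.
        "avg_remote_per_node": remote_reads / nodes,
        # Average number of remote readers per communicated element.
        "avg_sharing": remote_reads / len(readers) if readers else 0.0,
        # One multiply and one add per nonzero, per remote element fetched.
        "flops_per_remote_read": 2 * nnz / remote_reads if remote_reads else float("inf"),
    }


def spmv(rows, x):
    """One iteration of y = A x, taking every stored nonzero to be 1.0."""
    return [sum(x[c] for c in cols) for cols in rows]
```

Under these assumptions, raising `local_per_row` increases the computation-communication ratio without changing the communication volume, while raising `remote_per_node` increases the average communication per node. Controlling the degree of sharing would need an additional mechanism, for example drawing the remote columns from a small common pool.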