A model of system performance for parallel processing on clustered multiprocessors is developed that unifies multiprogramming with speedup and scaled speedup. The model is used to explore processor-to-process allocation alternatives for executing a workload consisting of multiple processes. Heuristics are developed that relate cluster size to the parallel fraction of a program and to process scaling factors. The basic analytical model is then refined by incorporating considerations that affect the realizable speedup, including explicit process scaling, Degree of Parallelism (DOP) as a discrete function, and communication overhead. New developments incorporate nonuniform workload, the interconnection network's probability of accepting requests, nonuniform memory access, and multithreaded processes.
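
As a rough illustration of the quantities the abstract relates (cluster size, parallel fraction, speedup, and scaled speedup), the sketch below uses the textbook Amdahl (fixed-size) and Gustafson (scaled) formulas. It is not the paper's model: the function names, the even split of processors among processes, and the omission of communication overhead are all assumptions made for this example.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Fixed-size speedup (Amdahl's law) for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def scaled_speedup(parallel_fraction: float, processors: int) -> float:
    """Scaled speedup (Gustafson's law): the parallel part grows with processors."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return (1.0 - parallel_fraction) + parallel_fraction * processors


def per_process_speedups(parallel_fraction: float,
                         total_processors: int,
                         num_processes: int) -> tuple[int, float, float]:
    """Hypothetical allocation: split processors evenly, one cluster per process.

    Returns (cluster_size, fixed_size_speedup, scaled_speedup) for one process.
    Communication overhead and nonuniform workloads are ignored here.
    """
    if num_processes < 1 or total_processors < num_processes:
        raise ValueError("need at least one processor per process")
    cluster_size = total_processors // num_processes
    return (cluster_size,
            amdahl_speedup(parallel_fraction, cluster_size),
            scaled_speedup(parallel_fraction, cluster_size))


if __name__ == "__main__":
    # Compare allocations of 16 processors among 1, 2, 4, 8, or 16 processes.
    for k in (1, 2, 4, 8, 16):
        size, fixed, scaled = per_process_speedups(0.9, 16, k)
        print(f"processes={k:2d} cluster={size:2d} "
              f"fixed={fixed:.2f} scaled={scaled:.2f}")
```

Under these assumptions, the loop shows the trade-off the abstract alludes to: larger clusters raise each process's speedup, but with diminishing returns under the fixed-size formula when the parallel fraction is below 1.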