THE IMPACT OF COMMUNICATION LOCALITY ON LARGE-SCALE MULTIPROCESSOR PERFORMANCE
As multiprocessor sizes scale and computer architects turn to interconnection networks with non-uniform communication latencies, the lure of exploiting communication locality to increase performance becomes inevitable. Models that accurately quantify locality effects provide invaluable insight into the importance of exploiting locality as machine sizes and features change. This paper presents a framework for modeling the impact of communication locality on system performance. The framework provides a means of combining simple models of application, processor, and network behavior into a single model that accurately reflects feedback effects between processors and networks. We introduce a model that characterizes application behavior with three parameters that capture computation grain, sensitivity to communication latency, and the amount of locality present at execution time. Using the combined model, we show that exploiting communication locality provides gains that are at most linear in the factor by which average communication distance is reduced when the number of outstanding communication transactions per processor is bounded. The combined model is also used to obtain rough upper bounds on the performance improvement from exploiting locality to minimize communication distance.
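The "at most linear" claim can be illustrated with a small Python sketch. This is not the paper's model: it is a deliberately simplified, contention-free toy in which latency is proportional to distance, and the names `grain`, `hops`, `hop_latency`, and `max_outstanding` are hypothetical parameters chosen for illustration. It assumes a processor computes for `grain` cycles between transactions and can keep at most `max_outstanding` transactions in flight.

```python
def issue_interval(grain, hops, hop_latency, max_outstanding):
    """Average cycles between successive transactions of one processor (toy model)."""
    # Assumption: latency grows linearly with distance and there is no contention.
    latency = hops * hop_latency
    # With a bounded number of transactions in flight, the processor overlaps at most
    # `max_outstanding` (grain + latency) periods, and it can never issue more often
    # than once per `grain` cycles.
    return max(grain, (grain + latency) / max_outstanding)


def locality_speedup(grain, hops, hop_latency, max_outstanding, factor):
    """Speedup from cutting the average communication distance by `factor` (>= 1)."""
    base = issue_interval(grain, hops, hop_latency, max_outstanding)
    local = issue_interval(grain, hops / factor, hop_latency, max_outstanding)
    return base / local


if __name__ == "__main__":
    # Sweep a few illustrative (made-up) parameter values and compare with the bound.
    for max_outstanding in (1, 2, 4):
        for factor in (2, 4, 8):
            s = locality_speedup(grain=10, hops=16, hop_latency=2,
                                 max_outstanding=max_outstanding, factor=factor)
            print(f"outstanding={max_outstanding} factor={factor} speedup={s:.3f}")
```

In this toy model the speedup never exceeds `factor`, and it shrinks toward 1 as `max_outstanding` grows, because more of the latency is already hidden by overlap. The framework described in the abstract additionally accounts for feedback between processors and the network, which this sketch omits.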