Towards an architecture-independent analysis of parallel algorithms
SIAM Journal on Computing
Active messages: a mechanism for integrated communication and computation
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
LogP: towards a realistic model of parallel computation
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Parallel programming in Split-C
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
The MIT Alewife machine: architecture and performance
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
A comparison of architectural support for messaging in the TMC CM-5 and the Cray T3D
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Fast Parallel Sorting Under LogP: Experience with the CM-5
IEEE Transactions on Parallel and Distributed Systems
LoPC: modeling contention in parallel algorithms
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Effects of communication latency, overhead, and bandwidth in a cluster architecture
Proceedings of the 24th annual international symposium on Computer architecture
Limits on Interconnection Network Performance
IEEE Transactions on Parallel and Distributed Systems
On the time complexity of broadcast communication schemes (Preliminary Version)
STOC '82 Proceedings of the fourteenth annual ACM symposium on Theory of computing
Exploiting Two-Case Delivery for Fast Protected Messaging
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
The Sensitivity of Communication Mechanisms to Bandwidth and Latency
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
The Effects of Latency, Occupancy, and Bandwidth in Distributed Shared Memory Multiprocessors
Predictive analysis of a wavefront application using LogGP
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
The impact of spatial layout of jobs on parallel I/O performance
Proceedings of the sixth workshop on I/O in parallel and distributed systems
Optimal Scheduling Algorithms for Communication Constrained Parallel Processing
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Predicting and Evaluating Distributed Communication Performance
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
A performance model for fine-grain accesses in UPC
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Performance analysis and optimization of MPI collective operations on multi-core clusters
The Journal of Supercomputing
Bandwidth-optimal all-to-all exchanges in fat tree networks
Proceedings of the 27th international ACM conference on International conference on supercomputing
In many real applications, for example those with frequent and irregular communication patterns or those using large messages, network contention and contention for message processing resources can be a significant part of the total execution time. This paper presents a new cost model, called LoGPC, that extends the LogP [9] and LogGP [4] models to account for the impact of network contention and network interface DMA behavior on the performance of message-passing programs. We validate LoGPC by analyzing three applications implemented with Active Messages [11, 18] on the MIT Alewife multiprocessor. Our analysis shows that network contention accounts for up to 50% of the total execution time. In addition, we show that the impact of communication locality on the communication costs is at most a factor of two on Alewife. Finally, we use the model to identify tradeoffs between synchronous and asynchronous message passing styles.
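For orientation, the contention-free baseline that LoGPC extends can be sketched as follows. This is a minimal illustration of the standard LogGP point-to-point cost (send overhead, per-byte gap for long messages, network latency, receive overhead), not the LoGPC model itself; the contention and DMA terms are not given in this abstract and are not reproduced. The function names, parameter values, and the idea of attributing the whole gap between measured and predicted time to contention are assumptions made for illustration only.

```python
def loggp_message_time(k_bytes, L, o, G):
    """Contention-free end-to-end time for one k-byte message under LogGP.

    L: network latency, o: per-message send/receive overhead,
    G: gap per byte for long messages (inverse bandwidth).
    All values share one time unit (for example, processor cycles).
    """
    if k_bytes < 1:
        raise ValueError("message must contain at least one byte")
    # send overhead + per-byte streaming cost + wire latency + receive overhead
    return o + (k_bytes - 1) * G + L + o


def contention_fraction(measured_total, contention_free_total):
    """Hypothetical helper: share of measured time not explained by the
    contention-free prediction (assumes the whole gap is contention)."""
    if measured_total <= 0:
        raise ValueError("measured time must be positive")
    return (measured_total - contention_free_total) / measured_total


if __name__ == "__main__":
    # Illustrative parameter values only; not measurements from Alewife.
    print(loggp_message_time(101, L=10, o=2, G=0.5))
    print(contention_fraction(2.0, 1.0))
```

A contention-aware model adds further terms on top of this baseline. When the measured time is twice the contention-free prediction, the helper reports a fraction of 0.5, which corresponds to the "up to 50%" case described above.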