Communication complexity of PRAMs
Theoretical Computer Science - Special issue: Fifteenth international colloquium on automata, languages and programming, Tampere, Finland, July 1988
A bridging model for parallel computation
Communications of the ACM
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Optimizing memory system performance for communication in parallel computers
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
LogP: a practical model of parallel computation
Communications of the ACM
LoPC: modeling contention in parallel algorithms
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
LoGPC: modeling network contention in message-passing programs
SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Predictive analysis of a wavefront application using LogGP
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Memory Hierarchy Considerations for Cost-Effective Cluster Computing
IEEE Transactions on Computers
LogGPS: a parallel computational model for synchronization analysis
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
Optimizing noncontiguous accesses in MPI – IO
Parallel Computing
Parallel Computer Architecture: A Hardware/Software Approach
Parallel Computer Architecture: A Hardware/Software Approach
Portability and Performance for Parallel Processing
Portability and Performance for Parallel Processing
Fast Measurement of LogP Parameters for Message Passing Platforms
IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
Parallelism in random access machines
STOC '78 Proceedings of the tenth annual ACM symposium on Theory of computing
Abstracting network characteristics and locality properties of parallel systems
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Quantifying Locality Effect in Data Access Delay: Memory logP
IPDPS '03 Proceedings of the 17th International Symposium on Parallel and Distributed Processing
Techniques for pipelined broadcast on ethernet switched clusters
Journal of Parallel and Distributed Computing
Runtime optimization of vector operations on large scale SMP clusters
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
mPlogP: A Parallel Computation Model for Heterogeneous Multi-core Computer
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
Performance analysis and optimization of MPI collective operations on multi-core clusters
The Journal of Supercomputing
Modeling communication in cache-coherent SMP systems: a case-study with Xeon Phi
Proceedings of the 22nd international symposium on High-performance parallel and distributed computing
Hi-index | 14.98 |
Many existing models of point-to-point communication in distributed systems ignore the impact of memory and middleware. Including such details may make these models impractical. Nonetheless, the growing gap between memory and CPU performance combined with the trend toward large-scale, clustered shared memory platforms implies an increased need to consider the impact of middleware on distributed communication. We present a general software-parameterized model of point-to-point communication for use in performance prediction and evaluation. We illustrate the utility of the model in three ways: 1) to derive a simplified, useful, more accurate model of point-to-point communication in clusters of SMPs, 2) to predict and analyze point-to-point and broadcast communication costs in clusters of SMPs, and 3) to express, compare, and contrast existing communication models. Though our methods are general, we present results on several Linux clusters to illustrate practical use on real systems.