Quantifying Locality Effect in Data Access Delay: Memory logP

Authors:
Kirk W. Cameron; Xian-He Sun
Affiliations:
-;-
Venue:
IPDPS '03 Proceedings of the 17th International Symposium on Parallel and Distributed Processing
Year:
2003

Cited 10

Quantification of memory communication

High performance scientific and engineering computing
Predicting and Evaluating Distributed Communication Performance

Proceedings of the 2004 ACM/IEEE conference on Supercomputing
$\log_{\rm n}{\rm P}$ and $\log_{3}{\rm P}$: Accurate Analytical Models of Point-to-Point Communication in Distributed Systems

IEEE Transactions on Computers
Techniques for pipelined broadcast on ethernet switched clusters

Journal of Parallel and Distributed Computing
Modeling multigrain parallelism on heterogeneous multi-core processors: a case study of the cell BE

HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
mPlogP: A Parallel Computation Model for Heterogeneous Multi-core Computer

CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
A performance model for fine-grain accesses in UPC

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Performance analysis and optimization of MPI collective operations on multi-core clusters

The Journal of Supercomputing
Towards a complexity model for design and analysis of PGAS-based algorithms

HPCC'07 Proceedings of the Third international conference on High Performance Computing and Communications
Modeling communication in cache-coherent SMP systems: a case-study with Xeon Phi

Proceedings of the 22nd international symposium on High-performance parallel and distributed computing

Abstract

The application of hardware-parameterized models to distributed systems can result in omission of key bottlenecks such as the full cost of inter-node communication in a shared memory cluster. However, inclusion in the model of message characteristics and complex memory hierarchies may result in impractical models. Nonetheless, the growing gap between memory and CPU performance, combined with the trend toward large-scale clustered shared memory platforms, implies an increased need to consider the impact of local memory communication on parallel processing in distributed systems. We present a simple and useful model of point-to-point memory communication to predict and analyze the latency of memory copy, pack and unpack. We use the model to isolate contributions of hardware, middleware, and software to data transfers on Intel- and MIPS-based platforms.
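As a rough illustration of the three operations the abstract names (memory copy, pack, unpack), the Python sketch below times a contiguous copy against a strided gather and scatter of the same payload. It is not the authors' model or their benchmark: the function names, the buffer size, and the stride are all assumptions chosen for illustration, and interpreter overhead means the numbers only hint at the locality effect the paper quantifies.

```python
import time


def pack(src, stride):
    """Gather every `stride`-th byte of src into a new contiguous buffer."""
    return src[::stride]


def unpack(packed, dst, stride):
    """Scatter a contiguous buffer back into every `stride`-th byte of dst."""
    dst[::stride] = packed


def best_time(fn, repeats=5):
    """Return the fastest of `repeats` wall-clock timings of fn(), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def measure(n_bytes=1 << 20, stride=8):
    """Time copy, pack and unpack for the same payload; return ns per byte."""
    src = bytearray(range(256)) * (n_bytes // 256)
    dst = bytearray(len(src))
    packed = pack(src, stride)
    payload = len(packed)
    contiguous = src[:payload]  # same number of bytes, but adjacent in memory

    seconds = {
        "copy": best_time(lambda: bytearray(contiguous)),
        "pack": best_time(lambda: pack(src, stride)),
        "unpack": best_time(lambda: unpack(packed, dst, stride)),
    }
    return {name: t / payload * 1e9 for name, t in seconds.items()}


if __name__ == "__main__":
    for name, ns_per_byte in measure().items():
        print(f"{name:>6}: {ns_per_byte:.2f} ns/byte")
```

All three operations move the same number of bytes, so any difference in the reported per-byte cost comes from the access pattern (plus interpreter overhead) rather than from the payload size.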