LogP: towards a realistic model of parallel computation
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Introduction to parallel computing: design and analysis of algorithms
Introduction to parallel computing: design and analysis of algorithms
Optimal broadcast and summation in the LogP model
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
An architecture for optimal all-to-all personalized communication
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Journal of Computational Physics
Selected Results from the ParkBench Benchmark
Euro-Par '96 Proceedings of the Second International Euro-Par Conference on Parallel Processing-Volume II
Performance Analysis of Wavefront Algorithms on Very-Large Scale Distributed Systems
Workshop on Wide Area Networks and High Performance Computing
A parallel particle-in-cell model for beam-beam interaction in high energy ring colliders
Journal of Computational Physics
Integrated Performance Monitoring of a Cosmology Application on Leading HEC Platforms
ICPP '05 Proceedings of the 2005 International Conference on Parallel Processing
Apex-Map: A Global Data Access Benchmark to Analyze HPC Systems and Parallel Programming Paradigms
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Tera-Scalable Algorithms for Variable-Density Elliptic Hydrodynamics with Spectral Accuracy
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Modeling application performance by convolving machine signatures with application profiles
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
A performance model of non-deterministic particle transport on large-scale systems
Future Generation Computer Systems
Performance modeling: understanding the past and predicting the future
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Bridging performance analysis tools and analytic performance modeling for HPC
Euro-Par 2010 Proceedings of the 2010 conference on Parallel processing
Hi-index | 0.00 |
An accurate modeling of the beam-beam interaction is essential to maximizing the luminosity in existing and future colliders. BeamBeam3D was the first parallel code that can be used to study this interaction fully self-consistently on high-performance computing platforms. Various all-to-all personalized communication (AAPC) algorithms dominate its communication patterns, for which we developed a sequence of performance models using a series of micro-benchmarks. We find that for SMP based systems the most important performance constraint is node-adapter contention, while for 3D-Torus topologies good performance models are not possible without considering link contention. The best average model prediction error is very low on SMP based systems with of 3% to 7%. On torus based systems errors of 29% are higher but optimized performance can again be predicted within 8% in some cases. These excellent results across five different systems indicate that this methodology for performance modeling can be applied to a large class of algorithms.1