Current work in high-productivity parallel computing has focused attention on the class of partitioned global address space (PGAS) parallel programming languages because they promise to reduce the effort required to develop parallel application codes. An important aspect of achieving good performance in PGAS languages is the effective handling of remote memory references. We extend a single-threaded reuse distance model to predict memory behavior for multi-threaded Unified Parallel C (UPC) applications. Our model handles changes in per-thread data size as well as changes in thread mapping due to problem size increases. Our results indicate that the model provides good predictions of remote memory behavior by accurately predicting changes in remote memory reuse distance as a function of the problem size and the number of threads.
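For readers unfamiliar with the underlying metric, the following is a minimal sketch of the standard single-threaded definition of reuse distance: the number of distinct addresses accessed between two consecutive accesses to the same address. It is not the multi-threaded UPC model described in the abstract; the function name `reuse_distances` and the toy trace are assumptions made purely for illustration.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return the reuse distance of every access in a memory trace.

    The reuse distance of an access is the number of distinct addresses
    touched since the previous access to the same address. First-time
    accesses have no previous use and are reported as None.
    (Illustrative helper; a simple O(n * m) approach, not tuned for
    long traces.)
    """
    stack = OrderedDict()  # LRU stack: most recently used address is last
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack.keys())
            # Addresses after addr in the stack were used since its last access.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances


# Hypothetical toy trace of six accesses to three addresses.
print(reuse_distances(list("abcabb")))  # [None, None, None, 2, 2, 0]
```

A histogram of these distances characterizes locality independently of any particular cache size; a model of the kind the abstract describes would then predict how such histograms shift with problem size and thread count.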