Scalable parallel trace-based performance analysis
EuroPVM/MPI'06 Proceedings of the 13th European PVM/MPI User's Group conference on Recent advances in parallel virtual machine and message passing interface
Scientific Programming - Large-Scale Programming Tools and Environments
Numerical shape optimization as an approach to extrusion die design
Finite Elements in Analysis and Design
Computer Science - Research and Development
The XNS computational fluid dynamics code ran successfully on Blue Gene/L, but its scalability was unsatisfactory until the first Jülich BlueGene/L Scaling Workshop gave the application developers and performance analysts an opportunity to start working together. Investigation of solver performance pinpointed a communication bottleneck that appeared at approximately 900 processes, and subsequent remediation allowed the application to continue scaling, with a four-fold improvement in simulation performance at 4,096 processes. This experience also validated the Scalasca performance analysis toolset on a complex application at large scale and helped direct the development of more comprehensive analyses. Performance properties have now been incorporated to automatically quantify point-to-point synchronisation time and wait states in scan operations, both of which were significant for XNS on Blue Gene/L.