This work provides a performance analysis of three leading supercomputers that have recently been deployed: Purple, Red Storm, and Blue Gene/L. The machines are architecturally diverse, with very different performance characteristics. Each contains over 10,000 processors and has a system peak of over 40 Teraflops. We analyze each system using a range of micro-benchmarks that measure communication performance and quantify the impact of the operating system. The achievable application performance is compared across the systems and confirmed with detailed application models that take as input the underlying performance characteristics measured by the micro-benchmarks. We also compare the machines in a realistic production scenario in which each machine is used so as to maximize its memory usage, with the applications executed in weak-scaling mode. The results also help illustrate that achievable performance is not directly related to peak performance.