The emergence of new parallel architectures presents new challenges for application developers. Supercomputers vary in processor speed, network topology, interconnect communication characteristics and memory subsystems. This paper presents a performance comparison of three of the fastest machines in the world: IBM's Blue Gene/P installation at ANL (Intrepid), the Sun-InfiniBand cluster at TACC (Ranger) and Cray's XT4 installation at ORNL (Jaguar). Comparisons are based on three applications selected by NSF for the Track 1 proposal to benchmark the Blue Waters system: NAMD, MILC and a turbulence code, DNS. We present a comprehensive overview of the architectural details of each of these machines and a comparison of their basic performance parameters. Application performance is presented for multiple problem sizes and the relative performance on the selected machines is explained through micro-benchmarking results. We hope that insights from this work will be useful to managers making buying decisions for supercomputers and application users trying to decide on a machine to run on. Based on the performance analysis techniques used in the paper, we also suggest a step-by-step procedure for estimating the suitability of a given architecture for a highly parallel application.