Early evaluation of the Cray XT3

Authors:
Jeffrey S. Vetter; Sadaf R. Alam; Thomas H. Dunigan, Jr.; Mark R. Fahey; Philip C. Roth; Patrick H. Worley
Affiliation:
Oak Ridge National Laboratory, Oak Ridge, TN (all authors)
Venue:
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Year:
2006

Abstract

Oak Ridge National Laboratory recently took delivery of a 5,294-processor Cray XT3. The XT3 is Cray's third-generation massively parallel processing system. The system is built on a single-processor node based on the AMD Opteron and uses a custom chip, called SeaStar, for interprocessor communication. In addition, the system runs a lightweight operating system on the compute nodes. This paper describes our initial experiences with the system, including micro-benchmark, kernel, and application benchmark results. In particular, we provide performance results for strategic Department of Energy application areas, including climate and fusion. We report experiments on the installed system that scale applications up to 4,096 processors.