A Performance Evaluation of the Cray X1 for Scientific Applications

Authors:
Leonid Oliker;Rupak Biswas;Julian Borrill;Andrew Canning;Jonathan Carter;M. Jahed Djomehri;Hongzhang Shan;David Skinner
Affiliations:
CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA;NAS Division, NASA Ames Research Center, Moffett Field, CA;CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA;CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA;CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA;NAS Division, NASA Ames Research Center, Moffett Field, CA;CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA;CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA
Venue:
VECPAR'04 Proceedings of the 6th international conference on High Performance Computing for Computational Science
Year:
2004

Abstract

The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors used to build high-end capability and capacity computers, primarily because of their generality, scalability, and cost effectiveness. However, the recent development of massively parallel vector systems is having a significant effect on the supercomputing landscape. In this paper, we compare the performance of the recently released Cray X1 vector system with that of the cacheless NEC SX-6 vector machine and the superscalar cache-based IBM Power3 and Power4 architectures for scientific applications. Overall results demonstrate that the X1 is quite promising, and further performance improvements are expected as the hardware, systems software, and numerical libraries mature. Reengineering codes to use the complex architecture effectively may also lead to significant efficiency gains.