Oak Ridge National Laboratory installed a 32-processor Cray X1 in March 2003 and will have a 256-processor system installed by October 2003. In this paper we describe our initial evaluation of the X1 architecture, focusing on microbenchmarks, kernels, and application codes that highlight its performance characteristics and indicate how to use the system most efficiently.