Petascale supercomputers will be available by 2008. The largest of these complex leadership-class machines will probably have nearly 250K CPUs. Massively parallel systems of this size raise a number of challenging operating system issues. In this paper, we focus on the issues most important for the system that will first breach the petaflop barrier: synchronization and collective operations, parallel I/O, and fault tolerance.