Performance experiences on Sun's Wildfire prototype

Authors:
Lisa Noordergraaf;Ruud van der Pas
Affiliations:
High End Server Engineering, Sun Microsystems, Burlington, MA;European HPC Team, Sun Microsystems, Geneva, Switzerland
Venue:
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Year:
1999

Citing 17
Cited 20

The DASH prototype: implementation and performance

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
DDM: A Cache-Only Memory Architecture

Computer
The Stanford FLASH multiprocessor

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Scheduling and page migration for multiprocessor compute servers

ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Informing memory operations: providing memory performance feedback in modern processors

ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
STiNG: a CC-NUMA computer system for the commercial marketplace

ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Operating system support for improving data locality on CC-NUMA compute servers

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Reactive NUMA: a design for unifying S-COMA and CC-NUMA

Proceedings of the 24th annual international symposium on Computer architecture
The SGI Origin: a ccNUMA highly scalable server

Proceedings of the 24th annual international symposium on Computer architecture
A methodology and an evaluation of the SGI Origin2000

SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Scaling application performance on a cache-coherent multiprocessor

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Measuring memory hierarchy performance of cache-coherent multiprocessors using micro benchmarks

SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Cache-Only Memory Architectures

Computer
Starfire: Extending the SMP Envelope

IEEE Micro
An argument for simple COMA

HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
WildFire: A Scalable Path for SMPs

HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
lmbench: portable tools for performance analysis

ATEC '96 Proceedings of the 1996 annual conference on USENIX Annual Technical Conference

Optimization of MPI collectives on clusters of large-scale SMP's

SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
A case for user-level dynamic page migration

Proceedings of the 14th international conference on Supercomputing
Removing the overhead from software-based shared memory

Proceedings of the 2001 ACM/IEEE conference on Supercomputing
OpenMP versus MPI for PDE Solvers Based on Regular Sparse Numerical Operators

ICCS '02 Proceedings of the International Conference on Computational Science-Part III
Performance of High-Accuracy PDE Solvers on a Self-Optimizing NUMA Architecture

Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
Efficient synchronization for nonuniform communication architectures

Proceedings of the 2002 ACM/IEEE conference on Supercomputing
Hierarchical Backoff Locks for Nonuniform Communication Architectures

HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
User-Level Dynamic Page Migration for Multiprogrammed Shared-Memory Multiprocessors

ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Using Hardware Counters to Automatically Improve Memory Performance

Proceedings of the 2004 ACM/IEEE conference on Supercomputing
NUMA-Aware Java Heaps for Server Applications

IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
affinity-on-next-touch: increasing the performance of an industrial PDE solver on a cc-NUMA system

Proceedings of the 19th annual international conference on Supercomputing
Hardware profile-guided automatic page placement for ccNUMA systems

Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
OpenMP versus MPI for PDE solvers based on regular sparse numerical operators

Future Generation Computer Systems
Hardware monitors for dynamic page migration

Journal of Parallel and Distributed Computing
OpenMP versus MPI for PDE solvers based on regular sparse numerical operators

Future Generation Computer Systems
Feedback-directed page placement for ccNUMA via hardware-generated memory traces

Journal of Parallel and Distributed Computing
Affinity-on-next-touch: an extension to the Linux kernel for NUMA architectures

PPAM'09 Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part I
Revisiting shared virtual memory systems for non-coherent memory-coupled cores

Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores
Node-based memory management for scalable NUMA architectures

Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers
Scale-out NUMA

Proceedings of the 19th international conference on Architectural support for programming languages and operating systems

Quantified Score

Hi-index	0.00

Performance experiences on Sun's Wildfire prototype

Quantified Score

Visualization

Abstract