Paging tradeoffs in distributed-shared-memory multiprocessors

Authors:
Douglas C. Burger;Rahmat S. Hyder;Barton P. Miller;David A. Wood
Affiliations:
University of Wisconsin-Madison, Madison, WI;University of Wisconsin-Madison, Madison, WI;University of Wisconsin-Madison, Madison, WI;University of Wisconsin-Madison, Madison, WI
Venue:
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Year:
1994

Abstract

Massively parallel processors have begun using commodity operating systems that support demand-paged virtual memory. To evaluate the utility of virtual memory, we measured the behavior of seven shared-memory parallel application programs on a simulated distributed-shared-memory machine. Our results (i) confirm the importance of gang CPU scheduling, (ii) show that a page-faulting processor should spin rather than invoke a parallel context switch, (iii) show that our parallel programs frequently touch most of their data, and (iv) indicate that memory, not just CPUs, must be "gang scheduled". Overall, our experiments demonstrate that demand paging has limited value on current parallel machines because of the applications' synchronization and memory reference patterns and the machines' high page-fault and parallel-context-switch overheads.
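
Finding (ii) can be illustrated with a simple break-even comparison. The sketch below is not the paper's model; it assumes, purely for illustration, that spinning wastes the whole page-fault latency, while a parallel context switch costs one switch away from the job plus one switch back. The function names, parameters, and example numbers are all hypothetical.

```python
# Illustrative break-even model only; not taken from the paper.
# Assumption: a parallel context switch is paid twice (switch out, switch back).

def idle_time_if_spinning(fault_latency_us: float) -> float:
    """Time wasted when the faulting processor spins until the page arrives."""
    return fault_latency_us


def overhead_if_switching(switch_cost_us: float) -> float:
    """Time wasted by switching the whole gang out and later back in."""
    return 2.0 * switch_cost_us


def should_spin(fault_latency_us: float, switch_cost_us: float) -> bool:
    """Spin when waiting out the fault costs no more than the round-trip switch."""
    return idle_time_if_spinning(fault_latency_us) <= overhead_if_switching(switch_cost_us)


if __name__ == "__main__":
    # Hypothetical numbers: a 10 ms fault and a 20 ms parallel context switch.
    print(should_spin(fault_latency_us=10_000, switch_cost_us=20_000))
```

Under these assumptions, spinning wins whenever the parallel-context-switch overhead is high relative to the page-fault latency, which matches the direction of the abstract's conclusion.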