The implications of cache affinity on processor scheduling for multiprogrammed, shared memory multiprocessors

Authors:
Raj Vaswani; John Zahorjan
Affiliations:
Department of Computer Science and Engineering, University of Washington, Seattle, WA (both authors)
Venue:
SOSP '91 Proceedings of the thirteenth ACM symposium on Operating systems principles
Year:
1991

Abstract

In a shared memory multiprocessor with caches, executing tasks develop "affinity" to processors by filling their caches with data and instructions during execution. A scheduling policy that ignores this affinity may waste processing power by causing excessive cache refilling. Our work focuses on quantifying the effect of processor reallocation on the performance of various parallel applications multiprogrammed on a shared memory multiprocessor, and on evaluating how the magnitude of this cost affects the choice of scheduling policy. We first identify the components of application response time, including processor reallocation costs. Next, we measure the impact of reallocation on the cache behavior of several parallel applications executing on a Sequent Symmetry multiprocessor. We also measure the performance of these applications under a number of alternative allocation policies. These experiments lead us to conclude that on current machines processor affinity has only a very weak influence on the choice of scheduling discipline, and that the benefits of frequent processor reallocation (in response to the changing parallelism of jobs) outweigh the penalties imposed by such reallocation. Finally, we use this experimental data to parameterize a simple analytic model, allowing us to evaluate the effect of processor affinity on future machines, those containing faster processors and larger caches.
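
The trade-off described in the abstract can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' analytic model. It assumes, purely for illustration, that each reallocation forces a task to reload its entire cache footprint and that reloads do not overlap with computation. The function name, parameter names, and all numbers are hypothetical and are not taken from the paper.

```python
def reload_overhead_fraction(footprint_lines, miss_penalty_cycles,
                             useful_cycles_between_reallocations):
    """Fraction of total cycles spent refilling the cache after reallocation.

    Hypothetical simplification: every reallocation reloads the whole
    footprint, and each reloaded line costs one full miss penalty.
    """
    if useful_cycles_between_reallocations <= 0:
        raise ValueError("useful_cycles_between_reallocations must be positive")
    if footprint_lines < 0 or miss_penalty_cycles < 0:
        raise ValueError("footprint and miss penalty must be non-negative")
    # Cycles lost to refilling the cache after one reallocation.
    reload_cycles = footprint_lines * miss_penalty_cycles
    # Total time per interval is useful work plus the refill cost.
    return reload_cycles / (reload_cycles + useful_cycles_between_reallocations)


# Made-up parameters: a 1000-line footprint, 950,000 useful cycles per interval.
baseline = reload_overhead_fraction(1000, 50, 950_000)
# A faster processor sees the same memory latency as more cycles per miss.
faster_cpu = reload_overhead_fraction(1000, 500, 950_000)
print(f"baseline overhead:   {baseline:.3f}")
print(f"faster CPU overhead: {faster_cpu:.3f}")
```

Under these assumptions the overhead stays small while the reload cost is short relative to the interval between reallocations, and it grows as the miss penalty (in processor cycles) or the footprint grows, which is the kind of sensitivity the abstract's "future machines" question concerns.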