Data sharing conscious scheduling for multi-threaded applications on SMP machines

Authors:
Shlomit S. Pinter;Marcel Zalmanovici
Affiliations:
IBM Haifa Research Lab, Haifa University, Haifa, Israel;IBM Haifa Research Lab, Haifa University, Haifa, Israel
Venue:
Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
Year:
2006



Abstract

Extensive use of multi-threaded applications that run on SMP machines justifies modifying thread scheduling algorithms to take thread characteristics into account in order to improve performance. Current schedulers (e.g., in Linux and AIX) avoid migrating tasks between CPUs unless absolutely necessary. Unwarranted data cache misses occur when tasks that share data run on different CPUs, or run far apart in time on the same CPU. This work presents an extension to the Linux scheduler that exploits inter-task data relations to reduce data cache misses in multi-threaded applications running on SMP platforms, thus improving runtime, memory throughput, and energy consumption. Our approach schedules a task on the CPU that holds the relevant data rather than on the one with the highest affinity. We observed improvements in CPU time and throughput on several benchmarks. For the Chat benchmark, the improvement in CPU time and cache misses is over 30% on average.
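The placement idea in the abstract (prefer the CPU that holds the relevant data over the CPU with the highest affinity) can be illustrated with a toy model. The following is only a sketch under stated assumptions, not the paper's Linux scheduler extension: the function `pick_cpu`, the `sharing` map of data-sharing relations, and the `recent_tasks` map used as a stand-in for "which CPU's cache probably holds the data" are all hypothetical names introduced here for illustration.

```python
def pick_cpu(task, last_cpu, recent_tasks, sharing):
    """Toy placement policy (hypothetical, not the paper's implementation).

    task         -- id of the task being scheduled
    last_cpu     -- dict: task id -> CPU it last ran on (plain affinity)
    recent_tasks -- dict: CPU id -> list of task ids that ran there recently,
                    used as a rough proxy for what that CPU's cache holds
    sharing      -- dict: task id -> set of task ids it shares data with
    """
    partners = sharing.get(task, set())

    # Score each CPU by how many data-sharing partners ran there recently.
    score = {
        cpu: sum(1 for t in tasks if t in partners)
        for cpu, tasks in recent_tasks.items()
    }
    best = max(score.values(), default=0)

    # No partner ran anywhere recently: fall back to plain affinity.
    if best == 0:
        return last_cpu[task]

    candidates = [cpu for cpu, s in score.items() if s == best]
    # Among equally good CPUs, keep the task where it last ran if possible.
    if last_cpu[task] in candidates:
        return last_cpu[task]
    return min(candidates)


# Example: task "d" shares data with "a" and "b", which both ran on CPU 0,
# while "d" itself last ran on CPU 1. Task "e" shares data with nobody.
recent = {0: ["a", "b"], 1: ["c"]}
shares = {"d": {"a", "b"}, "e": set()}
last = {"d": 1, "e": 1}
cpu_for_d = pick_cpu("d", last, recent, shares)
cpu_for_e = pick_cpu("e", last, recent, shares)
```

In this model a pure affinity policy would keep "d" on CPU 1, whereas the sharing-aware policy moves it to CPU 0, next to the tasks whose data it uses; "e" stays on its last CPU because it has no sharing partners. A real scheduler would also have to weigh load balance and migration cost, which this sketch ignores.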