Building portable thread schedulers for hierarchical multiprocessors: the bubblesched framework

Authors:
Samuel Thibault;Raymond Namyst;Pierre-André Wacrenier
Affiliations:
INRIA Futurs, LaBRI, Talence cedex, France;INRIA Futurs, LaBRI, Talence cedex, France;INRIA Futurs, LaBRI, Talence cedex, France
Venue:
Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Year:
2007

Citing 6

Impact of Memory Contention on Dynamic Scheduling on NUMA Multiprocessors

IEEE Transactions on Parallel and Distributed Systems
PaStiX: A Parallel Sparse Direct Solver Based on a Static Scheduling for Mixed 1D/2D Block Distributions

IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
Lightweight reference affinity analysis

Proceedings of the 19th annual international conference on Supercomputing
Hardware profile-guided automatic page placement for ccNUMA systems

Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Operating system scheduling for chip multithreaded processors

Operating system scheduling for chip multithreaded processors
An efficient multi-level trace toolkit for multi-threaded applications

Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing

Cited 5

Dynamic Task and Data Placement over NUMA Architectures: An OpenMP Runtime Perspective

IWOMP '09 Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism
Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments

Proceedings of the 16th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Work-stealing for mixed-mode parallelism by deterministic team-building

Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
Application-specific thread schedulers for internet server applications

Concurrency and Computation: Practice & Experience
Design and Implementation of Portable and Efficient Non-blocking Collective Communication

CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)

Abstract

Exploiting the full computational power of today's increasingly hierarchical multiprocessor machines requires a very careful distribution of threads and data across the underlying non-uniform architecture. Unfortunately, most operating systems provide only a poor scheduling API that does not allow applications to pass valuable scheduling hints to the system. In a previous paper [1], we showed that a bubble-based thread scheduler can significantly improve application performance in a portable way. However, since multithreaded applications have varied scheduling requirements, no universal scheduler can meet all of these needs. In this paper, we present a framework that allows scheduling experts to implement and experiment with customized thread schedulers. It provides a powerful API for dynamically distributing bubbles across the machine in a high-level, portable, and efficient way. Several examples show how experts can then develop, debug, and tune their own portable bubble schedulers.