Contention-aware scheduler: unlocking execution parallelism in multithreaded Java programs

Authors:
Feng Xian;Witawas Srisa-an;Hong Jiang
Affiliations:
University of Nebraska-Lincoln, Lincoln, NE, USA;University of Nebraska-Lincoln, Lincoln, NE, USA;University of Nebraska-Lincoln, Lincoln, NE, USA
Venue:
Proceedings of the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications
Year:
2008

Abstract

In multithreaded programming, locks are frequently used as a mechanism for synchronization. Because today's operating systems do not consider lock usage as a scheduling criterion, scheduling decisions can be unfavorable to multithreaded applications, leading to performance issues such as convoying and heavy lock contention in systems with multiple processors. Previous efforts to address these issues (e.g., transactional memory, lock-free data structures) often treat scheduling decisions as "a fact of life," and therefore try to cope with the consequences of undesirable scheduling instead of dealing with the problem directly. In this paper, we introduce Contention-Aware Scheduler (CA-Scheduler), which is designed to support efficient execution of large multithreaded Java applications in multiprocessor systems. Our proposed scheduler employs a scheduling policy that reduces lock contention. As will be shown in this paper, our prototype implementation of the CA-Scheduler in Linux and the Sun HotSpot virtual machine incurs only 3.5% runtime overhead, while the overall performance differences, when compared with a system with no contention awareness, range from a degradation of 3% in a small multithreaded benchmark to an improvement of 15% in a large Java application server benchmark.
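
The abstract does not spell out the scheduling policy itself. Purely as an illustration of what using lock usage as a scheduling criterion could look like, the following sketch groups threads by the lock they contend for most and places each group on a single processor, so that threads competing for the same lock are not spread across processors. Everything here is an assumption made for illustration: the class name `ContentionAwarePlacement`, the `assign` method, the per-thread "hottest lock" profile, and the least-loaded placement rule are hypothetical and are not taken from the CA-Scheduler implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, NOT the paper's algorithm: a toy placement rule that
// keeps threads contending for the same lock on the same processor.
class ContentionAwarePlacement {

    /**
     * @param hottestLockByThread assumed profile: thread name -> the lock it blocks on most
     * @param cpus                number of processors available
     * @return thread name -> processor index
     */
    static Map<String, Integer> assign(Map<String, String> hottestLockByThread, int cpus) {
        if (cpus <= 0) {
            throw new IllegalArgumentException("cpus must be positive");
        }

        // Group thread names by lock; TreeMap keeps iteration order deterministic.
        Map<String, List<String>> groups = new TreeMap<>();
        for (Map.Entry<String, String> e : new TreeMap<>(hottestLockByThread).entrySet()) {
            groups.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }

        // Place the largest groups first. List.sort is stable, so groups of
        // equal size stay in lock-name order.
        List<List<String>> ordered = new ArrayList<>(groups.values());
        ordered.sort(Comparator.comparingInt((List<String> g) -> g.size()).reversed());

        int[] load = new int[cpus];
        Map<String, Integer> cpuByThread = new TreeMap<>();
        for (List<String> group : ordered) {
            // Pick the processor with the fewest threads so far (lowest index on ties).
            int target = 0;
            for (int c = 1; c < cpus; c++) {
                if (load[c] < load[target]) {
                    target = c;
                }
            }
            for (String thread : group) {
                cpuByThread.put(thread, target);
            }
            load[target] += group.size();
        }
        return cpuByThread;
    }

    public static void main(String[] args) {
        // Made-up profile: three workers share one lock, two others use their own.
        Map<String, String> profile = new TreeMap<>();
        profile.put("worker-1", "cacheLock");
        profile.put("worker-2", "cacheLock");
        profile.put("worker-3", "cacheLock");
        profile.put("worker-4", "logLock");
        profile.put("worker-5", "queueLock");
        System.out.println(assign(profile, 2));
    }
}
```

In this toy rule, the three threads sharing `cacheLock` end up on one processor while the remaining threads share the other. A real scheduler would also have to weigh load balance, contention that changes over time, and the cost of migrating threads, none of which this sketch models.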