Register relocation: flexible contexts for multithreading
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Representing control in the presence of one-shot continuations
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Whole-program optimization for time and space efficient threads
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Simple, fast, and practical non-blocking and blocking concurrent queue algorithms
PODC '96 Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing
The implementation of the Cilk-5 multithreaded language
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Lisp and Symbolic Computation
Scheduling multithreaded computations by work stealing
Journal of the ACM (JACM)
The data locality of work stealing
Proceedings of the twelfth annual ACM symposium on Parallel algorithms and architectures
Linux Journal
Design of a separable transition-diagram compiler
Communications of the ACM
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Inter-task register-allocation for static operating systems
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
Lock reservation: Java locks can mostly do without atomic operations
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Revised Report on the Algorithmic Language Scheme
Higher-Order and Symbolic Computation
The Named-State Register File: Implementation and Performance
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
On the duality of operating system structures
ACM SIGOPS Operating Systems Review
Obstruction-Free Synchronization: Double-Ended Queues as an Example
ICDCS '03 Proceedings of the 23rd International Conference on Distributed Computing Systems
Capriccio: scalable threads for internet services
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Lightweight Multitasking Support for Embedded Systems using the Phantom Serializing Compiler
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Queue - Social Computing
Proceedings of the 43rd annual Design Automation Conference
Eliminating synchronization-related atomic operations with biased locking and bulk rebiasing
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Why events are a bad idea (for high-concurrency servers)
HOTOS'03 Proceedings of the 9th conference on Hot Topics in Operating Systems - Volume 9
Portable multithreading: the signal stack trick for user-space thread creation
ATEC '00 Proceedings of the annual conference on USENIX Annual Technical Conference
Lightweight concurrency primitives for GHC
Haskell '07 Proceedings of the ACM SIGPLAN workshop on Haskell workshop
Reducing Context Switch Overhead with Compiler-Assisted Threading
EUC '08 Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing - Volume 02
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Hi-index | 0.00 |
We propose a new language-neutral primitive for the LLVM compiler, which provides efficient context switching and message passing between lightweight threads of control. The primitive, called Swapstack, can be used by any language implementation based on LLVM to build higher-level language structures such as continuations, coroutines, and lightweight threads. As part of adding the primitives to LLVM, we have also added compiler support for passing parameters across context switches. Our modified LLVM compiler produces highly efficient code through a combination of exposing the context switching code to existing compiler optimizations, and adding novel compiler optimizations to further reduce the cost of context switches. To demonstrate the generality and efficiency of our primitives, we add one-shot continuations to C++, and provide a simple fiber library that allows millions of fibers to run on multiple cores, with a work-stealing scheduler and fast inter-fiber sychronization. We argue that compiler-supported lightweight context switching can be significantly faster than using a library to switch between contexts, and provide experimental evidence to support the position.