A concurrent, generational garbage collector for a multithreaded implementation of ML
POPL '93 Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Portable, unobtrusive garbage collection for multiprocessor systems
POPL '94 Proceedings of the 21st ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Generational stack collection and profile-driven pretenuring
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
A generational on-the-fly garbage collector for Java
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
On-the-fly garbage collection: an exercise in cooperation
Communications of the ACM
Thread-specific heaps for multi-threaded programs
Proceedings of the 2nd international symposium on Memory management
Cycles to recycle: garbage collection on the IA-64
Proceedings of the 2nd international symposium on Memory management
Java without the coffee breaks: a nonintrusive multiprocessor garbage collector
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Proceedings of the 3rd international symposium on Memory management
A Language-Independent Garbage Collector Toolkit
Status report: the manticore project
ML '07 Proceedings of the 2007 Workshop on ML
A lock-free, concurrent, and incremental stack scanning for garbage collectors
Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Investigating the effects of using different nursery sizing policies on performance
Proceedings of the 2009 international symposium on Memory management
Garbage collection for multicore NUMA machines
Proceedings of the 2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Multicore garbage collection with local heaps
Proceedings of the international symposium on Memory management
Reducing and eliding read barriers for concurrent garbage collectors
Proceedings of the 6th Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems
Reducing biased lock revocation by learning
Proceedings of the 6th Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems
Memory management for many-core processors with software configurable locality policies
Proceedings of the 2012 international symposium on Memory Management
Eliminating read barriers through procrastination and cleanliness
Proceedings of the 2012 international symposium on Memory Management
A study of the scalability of stop-the-world garbage collectors on multicores
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
The Intel labs Haskell research compiler
Proceedings of the 2013 ACM SIGPLAN symposium on Haskell
Programming a Multicore Architecture without Coherency and Atomic Operations
Proceedings of Programming Models and Applications on Multicores and Manycores
On the limits of modeling generational garbage collector performance
Proceedings of the 5th ACM/SPEC international conference on Performance engineering
This paper describes a garbage collector designed around permanent, private, thread-local nurseries, oriented principally towards functional languages. We maximize the cache hit rate by having each thread continually reuse its own private nursery. These private nurseries can be garbage collected independently of other threads, which keeps collection pause times low. Objects that survive thread-local collections are moved to a mature generation that can be collected either concurrently or in a stop-the-world fashion. We describe several optimizations (including two dynamic control-parameter adaptation schemes) related to collecting the private nurseries and to our concurrent collector, some of which are made possible when the language provides mutability information. We tested our collector on six benchmarks and saw single-threaded performance improvements of 5-74%. We also saw a 10x increase in scalability (at 24 processors) for one parallel benchmark that had previously been memory-bound.