A concurrent, generational garbage collector for a multithreaded implementation of ML
POPL '93 Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Portable, unobtrusive garbage collection for multiprocessor systems
POPL '94 Proceedings of the 21st ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Generational stack collection and profile-driven pretenuring
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
A generational on-the-fly garbage collector for Java
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
On-the-fly garbage collection: an exercise in cooperation
Communications of the ACM
Thread-specific heaps for multi-threaded programs
Proceedings of the 2nd international symposium on Memory management
Cycles to recycle: garbage collection on the IA-64
Proceedings of the 2nd international symposium on Memory management
Java without the coffee breaks: a nonintrusive multiprocessor garbage collector
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Proceedings of the 3rd international symposium on Memory management
A Language-Independent Garbage Collector Toolkit
Status report: the manticore project
ML '07 Proceedings of the 2007 Workshop on ML
A lock-free, concurrent, and incremental stack scanning for garbage collectors
Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Investigating the effects of using different nursery sizing policies on performance
Proceedings of the 2009 international symposium on Memory management
Garbage collection for multicore NUMA machines
Proceedings of the 2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Multicore garbage collection with local heaps
Proceedings of the international symposium on Memory management
Reducing and eliding read barriers for concurrent garbage collectors
Proceedings of the 6th Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems
Reducing biased lock revocation by learning
Proceedings of the 6th Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems
Memory management for many-core processors with software configurable locality policies
Proceedings of the 2012 international symposium on Memory Management
Eliminating read barriers through procrastination and cleanliness
Proceedings of the 2012 international symposium on Memory Management
A study of the scalability of stop-the-world garbage collectors on multicores
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
The Intel labs Haskell research compiler
Proceedings of the 2013 ACM SIGPLAN symposium on Haskell
Programming a Multicore Architecture without Coherency and Atomic Operations
Proceedings of Programming Models and Applications on Multicores and Manycores
On the limits of modeling generational garbage collector performance
Proceedings of the 5th ACM/SPEC international conference on Performance engineering
This paper describes a garbage collector designed around permanent, private, thread-local nurseries, oriented principally towards functional languages. We maximize the cache hit rate by having each thread continually reuse its own private nursery. These private nurseries can be garbage collected independently of other threads, which keeps collection pause times low. Objects that survive thread-local collections are moved to a mature generation that can be collected either concurrently or in a stop-the-world fashion. We describe several optimizations (including two dynamic control-parameter adaptation schemes) related to collecting the private nurseries and to our concurrent collector, some of which are made possible when the language provides mutability information. We tested our collector on six benchmarks and saw single-threaded performance improvements of 5-74%. We also saw a 10x increase in scalability (at 24 processors) for one parallel benchmark that had previously been memory-bound.