A scalable mark-sweep garbage collector on large-scale shared-memory machines

Authors:
Toshio Endo;Kenjiro Taura;Akinori Yonezawa
Affiliations:
The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113, Japan;The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113, Japan;The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113, Japan
Venue:
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Year:
1997

Abstract

This work describes the implementation of a mark-sweep garbage collector (GC) for shared-memory machines and reports its performance. It is a simple "parallel" collector in which all processors cooperatively traverse objects in the global shared heap. The collector stops the application program during a collection and assumes a uniform access cost to all locations in the shared heap. The implementation is based on the Boehm-Demers-Weiser conservative GC (Boehm GC). Experiments were run on an Ultra Enterprise 10000 (64 UltraSPARC processors, 250 MHz). We wrote two applications, BH (an N-body problem solver) and CKY (a context-free grammar parser), in a parallel extension to C++.

Through the experiments, we observe that load balancing is the key to achieving scalability. A naive collector without load redistribution hardly exhibits any speed-up (at most fourfold on 64 processors). Performance can be improved by dynamic load balancing, which exchanges objects to be scanned between processors, but a straightforward implementation still severely limits performance. First, large objects become a significant source of load imbalance, because the unit of load redistribution is a single object. Splitting a large object into small pieces before pushing it onto the mark stack improves performance. Next, processors spend a significant amount of time idle because termination detection through a shared counter serializes them. This problem appeared abruptly on more than 32 processors. A non-serializing method for termination detection eliminates the idle time and improves performance. With all of these implementation refinements, we achieved an average speed-up of 28.0 in BH and 28.6 in CKY on 64 processors.
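To make the large-object point concrete, the following is a minimal single-threaded sketch of splitting an object into bounded pieces before it goes onto a mark stack, so that the unit of redistribution is a piece rather than a whole object. It is not the authors' code. The names `MarkEntry`, `push_object`, `steal_half`, and the piece size `kChunkWords` are all hypothetical, the abstract does not state the piece size or the exchange policy, and the synchronization a real parallel collector needs is deliberately omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mark-stack entry: a contiguous range of words still to be scanned.
// `start` is a word index standing in for a heap address.
struct MarkEntry {
    std::size_t start;
    std::size_t length;
};

// Assumed tuning parameter; the abstract does not give the actual piece size.
constexpr std::size_t kChunkWords = 128;

// Push an object onto the mark stack, splitting it into pieces of at most
// `chunk_words` words so that no single entry represents a large amount of work.
void push_object(std::vector<MarkEntry>& mark_stack,
                 std::size_t start, std::size_t length,
                 std::size_t chunk_words = kChunkWords) {
    assert(chunk_words > 0);
    while (length > chunk_words) {
        mark_stack.push_back({start, chunk_words});
        start += chunk_words;
        length -= chunk_words;
    }
    if (length > 0) {
        mark_stack.push_back({start, length});
    }
}

// Assumed redistribution policy (not taken from the abstract): an idle
// processor takes the older half of a busy processor's entries.
// A real collector would need locking or atomic operations here.
std::size_t steal_half(std::vector<MarkEntry>& victim,
                       std::vector<MarkEntry>& thief) {
    const std::size_t n = victim.size() / 2;
    const auto mid = victim.begin() + static_cast<std::ptrdiff_t>(n);
    thief.insert(thief.end(), victim.begin(), mid);
    victim.erase(victim.begin(), mid);
    return n;
}
```

With this layout a 1000-word object pushed with a 128-word piece size becomes eight entries, so an idle processor can take several of them. Without splitting, the same object would be one indivisible entry that a single processor must scan alone.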