MULTILISP: a language for concurrent symbolic computation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Garbage collection in an uncooperative environment
Software—Practice & Experience
A garbage collection algorithm for shared memory parallel processors
International Journal of Parallel Programming
A space-efficient parallel garbage compaction algorithm
ICS '91 Proceedings of the 5th international conference on Supercomputing
Proceedings of the 14th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Adaptive optimization in the Jalapeño JVM
OOPSLA '00 Proceedings of the 15th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
The case for profile-directed selection of garbage collectors
Proceedings of the 2nd international symposium on Memory management
A generational mostly-concurrent garbage collector
Proceedings of the 2nd international symposium on Memory management
A scalable mark-sweep garbage collector on large-scale shared-memory machines
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Lock-Free Garbage Collection for Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
IBM Systems Journal
Parallel garbage collection for shared memory multiprocessors
JVM'01 Proceedings of the 2001 Symposium on JavaTM Virtual Machine Research and Technology Symposium - Volume 1
On the usefulness of type and liveness accuracy for garbage collection and leak detection
ACM Transactions on Programming Languages and Systems (TOPLAS)
High-Performance Scalable Java Virtual Machines
HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
Ulterior reference counting: fast garbage collection without a long wait
OOPSLA '03 Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programing, systems, languages, and applications
Myths and realities: the performance impact of garbage collection
Proceedings of the joint international conference on Measurement and modeling of computer systems
Dynamic selection of application-specific garbage collectors
Proceedings of the 4th international symposium on Memory management
Improving locality with parallel hierarchical copying GC
Proceedings of the 5th international symposium on Memory management
Controlling garbage collection and heap growth to reduce the execution time of Java applications
ACM Transactions on Programming Languages and Systems (TOPLAS)
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Parallel generational-copying garbage collection with a block-structured heap
Proceedings of the 7th international symposium on Memory management
A new approach to parallelising tracing algorithms
Proceedings of the 2009 international symposium on Memory management
Concurrent, parallel, real-time garbage-collection
Proceedings of the 2010 international symposium on Memory management
Iterative data-parallel mark&sweep on a GPU
Proceedings of the international symposium on Memory management
Parallel memory defragmentation on a GPU
Proceedings of the 2012 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Memory management for many-core processors with software configurable locality policies
Proceedings of the 2012 international symposium on Memory Management
Hi-index | 0.00 |
While uniprocessor garbage collection is relatively well understood, experience with collectors for large multiprocessor servers is limited and it is unknown which techniques best scale with large memories and large numbers of processors. In order to explore these issues we designed a modular garbage collection framework in the IBM Jalapeño Java virtual machine and implemented five different parallel garbage collectors: non-generational and generational versions of mark-and-sweep and semi-space copying collectors, as well as a hybrid of the two. We describe the optimizations necessary to achieve good performance across all of the collectors, including load balancing, fast synchronization, and inter-processor sharing of free lists. We then quantitatively compare the different collectors to find their asymptotic performance both with respect to how fast they can run applications as well as how little memory they can run them in. All of our collectors scale linearly up to sixteen processors. The least memory is usually required by the hybrid mark-sweep collector that uses a copying collector for its nursery, although sometimes the non-generational mark-sweep collector requires less memory. The fastest execution is more application-dependent. Our only application with a large working set performed best using the mark-sweep collector; with one exception, the rest of the applications ran fastest with one of the generational collectors.