Mostly parallel garbage collection
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Garbage collection: algorithms for automatic dynamic memory management
Java without the coffee breaks: a nonintrusive multiprocessor garbage collector
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
A scalable mark-sweep garbage collector on large-scale shared-memory machines
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Mostly concurrent garbage collection revisited
OOPSLA '03 Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
An on-the-fly mark and sweep garbage collector based on sliding views
OOPSLA '03 Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
A parallel, incremental, mostly concurrent garbage collector for servers
ACM Transactions on Programming Languages and Systems (TOPLAS)
Parallel generational-copying garbage collection with a block-structured heap
Proceedings of the 7th international symposium on Memory management
Limits of parallel marking garbage collection
Proceedings of the 7th international symposium on Memory management
A comparative evaluation of parallel garbage collector implementations
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
Tracing garbage collection on highly parallel platforms
Proceedings of the 2010 international symposium on Memory management
Parallel memory defragmentation on a GPU
Proceedings of the 2012 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
GPUs as an opportunity for offloading garbage collection
Proceedings of the 2012 international symposium on Memory Management
Automatic memory management makes programming easier. This also holds for general-purpose GPU computing, where no garbage collectors currently exist. In this paper we present a parallel mark-and-sweep collector that collects GPU memory on the GPU itself, and we tune its performance. Performance is improved by: (1) data-parallel marking and sweeping of regions of memory, (2) marking all elements of large arrays in parallel, and (3) trading parallelism for recursion to match deeply linked data structures. (1) is achieved by coarsely processing all potential objects in a region of memory in parallel. When a large array is detected during (1), it is set aside and a parallel-for is later issued on the GPU to mark its elements. For data structures that are long linked lists, we dynamically switch to a marking variant with less overhead that performs a few recursive steps sequentially (while processing multiple lists in parallel). The collector achieves a speedup of up to a factor of 11 over a sequential collector on the same GPU.