Garbage collection is a performance-critical feature of most modern object-oriented languages, and is characterized by poor locality since it must traverse the heap. In this paper we show that by combining two very simple ideas we can significantly improve the performance of the canonical mark-sweep collector, resulting in improvements in application performance. We make three main contributions: 1) we develop a methodology and framework for accurately and deterministically analyzing the tracing loop at the heart of the collector, 2) we offer a number of insights and improvements over conventional design choices for mark-sweep collectors, and 3) we find that two simple ideas, edge order traversal and software prefetch, combine to greatly improve garbage collection performance although each is unproductive in isolation. We perform a thorough analysis in the context of MMTk and Jikes RVM on a wide range of benchmarks and four different architectures. Our baseline system (which includes a number of our improvements) is very competitive with highly tuned alternatives. We show a simple marking mechanism which offers modest but consistent improvements over conventional choices. Finally, we show that enqueuing the edges (pointers) of the object graph rather than the nodes (objects) significantly increases opportunities for software prefetch, despite increasing the total number of queue operations. Combining edge ordered enqueuing with software prefetching yields average performance improvements over a large suite of benchmarks of 20-30% in garbage collection time and 4-6% of total application performance in moderate heaps, across four architectures.
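The difference between node enqueuing and edge enqueuing can be sketched with a toy model. The following is a minimal illustration only, not the MMTk or Jikes RVM code: the heap is assumed to be a plain dictionary mapping each object to the objects it points to, the function names `mark_node_order` and `mark_edge_order` are hypothetical, and the software prefetch is stood in for by a log entry, since Python has no prefetch instruction. The point it shows is that edge order defers the mark check until an entry is dequeued, so a prefetch issued at enqueue time has the whole queue residency in which to complete, at the cost of more queue operations.

```python
from collections import deque


def mark_node_order(roots, children):
    """Node enqueuing: test-and-mark each referent when it is discovered,
    and enqueue only objects that were not yet marked."""
    marked = set()
    queue = deque()
    enqueues = 0
    for obj in roots:
        if obj not in marked:
            marked.add(obj)
            queue.append(obj)
            enqueues += 1
    while queue:
        obj = queue.popleft()
        for child in children[obj]:
            # The child's mark state is touched immediately on discovery,
            # leaving no time for a prefetch of the child to take effect.
            if child not in marked:
                marked.add(child)
                queue.append(child)
                enqueues += 1
    return marked, enqueues


def mark_edge_order(roots, children):
    """Edge enqueuing: enqueue every pointer unchecked and test-and-mark
    the referent only when the entry is dequeued."""
    marked = set()
    queue = deque()
    prefetch_log = []  # stand-in for issuing a software prefetch (assumption)
    enqueues = 0
    for obj in roots:
        prefetch_log.append(obj)
        queue.append(obj)
        enqueues += 1
    while queue:
        obj = queue.popleft()
        # First touch of the referent happens here, after it has waited
        # in the queue behind earlier entries.
        if obj in marked:
            continue
        marked.add(obj)
        for child in children[obj]:
            prefetch_log.append(child)  # prefetch requested at enqueue time
            queue.append(child)
            enqueues += 1
    return marked, enqueues, prefetch_log


if __name__ == "__main__":
    # Hypothetical five-object heap; "e" is unreachable from the root.
    heap = {"a": ["b", "c"], "b": ["c", "d"], "c": ["d"], "d": ["a"], "e": []}
    print(mark_node_order(["a"], heap))
    print(mark_edge_order(["a"], heap))
```

On this small graph both traversals mark the same four reachable objects, but node order performs 4 enqueues while edge order performs 7 (one per root plus one per pointer in a reachable object). That matches the trade-off described above: more queue operations in exchange for a window between requesting a referent and first touching it.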