A parallel, incremental, mostly concurrent garbage collector for servers

  • Authors:
  • Katherine Barabash;Ori Ben-Yitzhak;Irit Goft;Elliot K. Kolodner;Victor Leikehman;Yoav Ossia;Avi Owshanko;Erez Petrank

  • Affiliations:
  • IBM Haifa Research Lab, Haifa, Israel;IBM Haifa Research Lab, Haifa, Israel;IBM Haifa Research Lab, Haifa, Israel;IBM Haifa Research Lab, Haifa, Israel;IBM Haifa Research Lab, Haifa, Israel;IBM Haifa Research Lab, Haifa, Israel;IBM Haifa Research Lab, Haifa, Israel;Technion, Israel

  • Venue:
  • ACM Transactions on Programming Languages and Systems (TOPLAS)
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Multithreaded applications with multigigabyte heaps running on modern servers provide new challenges for garbage collection (GC). The challenges for “server-oriented” GC include: ensuring short pause times on a multigigabyte heap while minimizing throughput penalty, good scaling on multiprocessor hardware, and keeping the number of expensive multicycle fence instructions required by weak ordering to a minimum.We designed and implemented a collector facing these demands building on the mostly concurrent garbage collector proposed by Boehm et al. [1991]. Our collector incorporates new ideas into the original collector. We make it parallel and incremental; we employ concurrent low-priority background GC threads to take advantage of processor idle time; we propose novel algorithmic improvements to the basic mostly concurrent algorithm improving its efficiency and shortening its pause times; and finally, we use advanced techniques, such as a low-overhead work packet mechanism to enable full parallelism among the incremental and concurrent collecting threads and ensure load balancing.We compared the new collector to the mature, well-optimized, parallel, stop-the-world mark-sweep collector already in the IBM JVM. When allowed to run aggressively, using 72% of the CPU utilization during a short concurrent phase, our collector prototype reduces the maximum pause time from 161 ms to 46 ms while only losing 11.5% throughput when running the SPECjbb2000 benchmark on a 600-MB heap on an 8-way PowerPC 1.1-GHz processors. When the collector is limited to a nonintrusive operation using only 29% of the CPU utilization, the maximum pause time obtained is 79 ms and the loss in throughput is 15.4%.