NUMA-Aware Java Heaps for Server Applications

Authors:
Mustafa M. Tikir; Jeffrey K. Hollingsworth
Affiliations:
University of Maryland, College Park; University of Maryland, College Park
Venue:
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Year:
2005

Abstract

We introduce a set of techniques to both measure and optimize the memory access locality of Java applications running on cc-NUMA servers. These techniques work at the object level and use information gathered from embedded hardware performance monitors. We propose a new NUMA-aware Java heap layout. In addition, we propose using dynamic object migration during garbage collection to move objects local to the processors that access them most. Our optimization technique reduced the number of non-local memory accesses in Java workloads generated from actual runs of the SPECjbb2000 benchmark by up to 41%, and also resulted in a 40% reduction in workload execution time.
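
The migration idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that sampled accesses have already been attributed to individual objects and tallied per NUMA node, and the names `MigrationSketch`, `ObjectProfile`, `preferredNode`, and `planMigrations` are hypothetical. The sketch only decides which objects a collector might relocate; it does not touch a real heap.

```java
import java.util.ArrayList;
import java.util.List;

public class MigrationSketch {

    /** Hypothetical profile: where an object lives and how often each node accessed it. */
    static final class ObjectProfile {
        final int objectId;
        final int homeNode;
        final long[] accessesByNode;

        ObjectProfile(int objectId, int homeNode, long[] accessesByNode) {
            this.objectId = objectId;
            this.homeNode = homeNode;
            this.accessesByNode = accessesByNode;
        }
    }

    /** Returns the node with the most sampled accesses; a tie with the home node keeps it in place. */
    static int preferredNode(long[] accessesByNode, int homeNode) {
        int best = homeNode;
        for (int node = 0; node < accessesByNode.length; node++) {
            if (accessesByNode[node] > accessesByNode[best]) {
                best = node;
            }
        }
        return best;
    }

    /** Lists {objectId, targetNode} pairs for objects whose dominant accessor is a remote node. */
    static List<int[]> planMigrations(List<ObjectProfile> profiles) {
        List<int[]> moves = new ArrayList<>();
        for (ObjectProfile p : profiles) {
            int target = preferredNode(p.accessesByNode, p.homeNode);
            if (target != p.homeNode) {
                moves.add(new int[] {p.objectId, target});
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        List<ObjectProfile> profiles = new ArrayList<>();
        // Object 1 lives on node 0 but is accessed mostly from node 1 (assumed counts).
        profiles.add(new ObjectProfile(1, 0, new long[] {5, 40}));
        // Object 2 already lives on the node that accesses it most.
        profiles.add(new ObjectProfile(2, 1, new long[] {7, 30}));
        for (int[] move : planMigrations(profiles)) {
            System.out.println("object " + move[0] + " -> node " + move[1]);
        }
    }
}
```

With the assumed counts above, only object 1 is selected for relocation, to node 1. A real collector would apply such a plan while copying live objects, and would likely require a margin over the home node's count before moving an object, to avoid moving it back and forth.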