MULTILISP: a language for concurrent symbolic computation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Distributed snapshots: determining global states of distributed systems
ACM Transactions on Computer Systems (TOCS)
Memory management with explicit regions
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Reducing cache misses using hardware and software page placement
ICS '99 Proceedings of the 13th international conference on Supercomputing
Java without the coffee breaks: a nonintrusive multiprocessor garbage collector
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
A parallel, real-time garbage collector
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Hoard: a scalable memory allocator for multithreaded applications
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
A scalable mark-sweep garbage collector on large-scale shared-memory machines
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
A parallel, incremental and concurrent GC for servers
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
Starting with termination: a methodology for building distributed garbage collection algorithms
ACSC '01 Proceedings of the 24th Australasian conference on Computer science
A parallel java grande benchmark suite
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
Creating and preserving locality of java applications at allocation and garbage collection times
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Evaluation of Parallel Copying Garbage Collection on a Shared-Memory Multiprocessor
IEEE Transactions on Parallel and Distributed Systems
Data Flow Analysis for Software Prefetching Linked Data Structures in Java
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
The nofib Benchmark Suite of Haskell Programs
Proceedings of the 1992 Glasgow Workshop on Functional Programming
An efficient parallel heap compaction algorithm
OOPSLA '04 Proceedings of the 19th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Dynamic selection of application-specific garbage collectors
Proceedings of the 4th international symposium on Memory management
Proceedings of the 1st ACM/USENIX international conference on Virtual execution environments
The Compressor: concurrent, incremental, and parallel compaction
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Parallel garbage collection for shared memory multiprocessors
JVM'01 Proceedings of the 2001 Symposium on JavaTM Virtual Machine Research and Technology Symposium - Volume 1
Cell GC: using the cell synergistic processor as a garbage collection coprocessor
Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Parallel generational-copying garbage collection with a block-structured heap
Proceedings of the 7th international symposium on Memory management
A Fully Parallel LISP2 Compactor with Preservation of the Sliding Properties
Languages and Compilers for Parallel Computing
A new approach to parallelising tracing algorithms
Proceedings of the 2009 international symposium on Memory management
Reactive NUCA: near-optimal block placement and replication in distributed caches
Proceedings of the 36th annual international symposium on Computer architecture
Hosting an object heap on manycore hardware: an exploration
DLS '09 Proceedings of the 5th symposium on Dynamic languages
A comparative evaluation of parallel garbage collector implementations
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
Tracing garbage collection on highly parallel platforms
Proceedings of the 2010 international symposium on Memory management
Optimizations in a private nursery-based garbage collector
Proceedings of the 2010 international symposium on Memory management
C4: the continuously concurrent compacting collector
Proceedings of the international symposium on Memory management
A study of the scalability of stop-the-world garbage collectors on multicores
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Traffic management: a holistic approach to memory placement on NUMA systems
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
As processors evolve towards higher core counts, architects will develop more sophisticated memory systems to satisfy the cores' increasing thirst for memory bandwidth. Early many-core processor designs suggest that future memory systems will likely include multiple controllers and distributed cache coherence protocols. Many-core processors that expose memory locality policies to the software system provide opportunities for automatic tuning that can achieve significant performance benefits. Managed languages typically provide a simple heap abstraction. This paper presents techniques that bridge the gap between the simple heap abstraction of modern languages and the complicated memory systems of future processors. We present a NUMA-aware approach to garbage collection that balances the competing concerns of data locality and heap utilization to improve performance. We combine a lightweight approach for measuring an application's memory behavior with an online, adaptive algorithm for tuning the cache to optimize it for the specific application's behaviors. We have implemented our garbage collector and cache tuning algorithm and present results on a 64-core TILEPro64 processor.