Memory system performance of programs with intensive heap allocation

Authors:
Amer Diwan;David Tarditi;Eliot Moss
Affiliations:
Department of Computer Science, University of Massachusetts, Amherst, MA;Computer Science Department, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA;Department of Computer Science, University of Massachusetts, Amherst, MA
Venue:
ACM Transactions on Computer Systems (TOCS)
Year:
1995

Citing 23
Cited 18

ORBIT: an optimizing compiler for scheme

SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
Garbage collection can be faster than stack allocation

Information Processing Letters
A Case for Direct-Mapped Caches

Computer
Simple generational garbage collection and fast allocation

Software—Practice & Experience
Continuation-passing, closure-passing style

POPL '89 Proceedings of the 16th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Evaluating Associativity in CPU Caches

IEEE Transactions on Computers
Computer architecture: a quantitative approach

Computer architecture: a quantitative approach
Cache and memory hierarchy design: a performance-directed approach

Cache and memory hierarchy design: a performance-directed approach
Representing control in the presence of first-class continuations

PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
A runtime system

Lisp and Symbolic Computation
Abstract execution: a technique for efficiently tracing programs

Software—Practice & Experience
Elements of functional programming

Elements of functional programming
Cache behavior of combinator graph reduction

ACM Transactions on Programming Languages and Systems (TOPLAS)
Compiling with continuations

Compiling with continuations
Page placement algorithms for large real-indexed caches

ACM Transactions on Computer Systems (TOCS)
Caching considerations for generational garbage collection

LFP '92 Proceedings of the 1992 ACM conference on LISP and functional programming
Optimally profiling and tracing programs

POPL '92 Proceedings of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
The concurrency workbench: a semantics-based tool for the verification of concurrent systems

ACM Transactions on Programming Languages and Systems (TOPLAS)
Cache write policies and performance

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The impact of operating system structure on memory system performance

SOSP '93 Proceedings of the fourteenth ACM symposium on Operating systems principles
Cache performance of garbage-collected programming languages

Cache performance of garbage-collected programming languages
A nonrecursive list compacting algorithm

Communications of the ACM
A LISP garbage-collector for virtual-memory computer systems

Communications of the ACM

TIL: a type-directed optimizing compiler for ML

PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Reconciling responsiveness with performance in pure object-oriented languages

ACM Transactions on Programming Languages and Systems (TOPLAS)
The structure and performance of interpreters

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Cache behavior of network protocols

SIGMETRICS '97 Proceedings of the 1997 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Research Demonstration of a Hardware Reference-Counting Heap

Lisp and Symbolic Computation
Comparing mostly-copying and mark-sweep conservative collection

Proceedings of the 1st international symposium on Memory management
Memory system behavior of Java programs: methodology and analysis

Proceedings of the 2000 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Static load classification for improving the value predictability of data-cache misses

PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
Avoiding initialization misses to the heap

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
The cache behaviour of large lazy functional programs on stock hardware

Proceedings of the 2002 workshop on Memory system performance
TIL: a type-directed, optimizing compiler for ML

ACM SIGPLAN Notices - Best of PLDI 1979-1999
Quantifying the performance of garbage collection vs. explicit memory management

OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Using Scratchpad to Exploit Object Locality in Java

ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Evaluating the impact of the simulation environment on experimentation results

Performance Evaluation
Data layouts for object-oriented programs

Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
The transactional memory / garbage collection analogy

Proceedings of the 22nd annual ACM SIGPLAN conference on Object-oriented programming systems and applications
A cache-pinning strategy for improving generational garbage collection

HiPC'06 Proceedings of the 13th international conference on High Performance Computing
Efficient maintenance of ephemeral data

DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications

Abstract

Heap allocation with copying garbage collection is a general storage management technique for programming languages. It is believed to have poor memory system performance. To investigate this, we conducted an in-depth study of the memory system performance of heap allocation for memory systems found on many machines. We studied the performance of mostly functional Standard ML programs that made heavy use of heap allocation. We found that most machines support heap allocation poorly. However, with the appropriate memory system organization, heap allocation can have good performance. The memory system property crucial for achieving good performance was the ability to allocate and initialize a new object into the cache without a penalty. This can be achieved by having subblock placement with a subblock size of one word with a write-allocate policy, along with fast page-mode writes or a write buffer. For caches with subblock placement, the data cache overhead was under 9% for a 64K or larger data cache; without subblock placement the overhead was often higher than 50%.
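
The mechanism named in the abstract can be illustrated with a toy model. The sketch below is not the authors' simulator. It is a minimal illustration under stated assumptions: a direct-mapped, write-allocate cache addressed in words, a hypothetical function name `count_write_miss_fetches`, a default of 16384 words (64K bytes if a word is 4 bytes), and a 4-word block. It counts only the block fetches caused by write misses when a program writes freshly allocated words in address order, and it ignores reads, write-back traffic, and timing.

```python
def count_write_miss_fetches(addresses, cache_words=16384, block_words=4,
                             subblock_placement=False):
    """Count memory fetches caused by write misses (toy model, writes only).

    All parameter names and default sizes are illustrative assumptions.
    """
    num_lines = cache_words // block_words
    tags = [None] * num_lines                      # block number held by each line
    valid = [[False] * block_words for _ in range(num_lines)]  # per-word valid bits
    fetches = 0
    for addr in addresses:                         # addr is a word address
        block = addr // block_words
        index = block % num_lines                  # direct-mapped placement
        offset = addr % block_words
        if tags[index] != block:
            # Write miss: write-allocate claims the line for the new block.
            tags[index] = block
            if subblock_placement:
                # One-word subblocks: no fetch, the other words stay invalid.
                valid[index] = [False] * block_words
            else:
                # Whole-block validity: the rest of the block must be fetched.
                fetches += 1
                valid[index] = [True] * block_words
        # The written word itself is now valid in either organization.
        valid[index][offset] = True
    return fetches


# Sequential initialization of 1024 freshly allocated words.
allocation_stream = range(1024)
print(count_write_miss_fetches(allocation_stream, subblock_placement=False))
print(count_write_miss_fetches(allocation_stream, subblock_placement=True))
```

In this model, the cache without subblock placement fetches one block per newly touched block of the allocation area, while the cache with one-word subblocks fetches nothing for the same writes. This is the "allocate and initialize without a penalty" property the abstract describes. The model says nothing about the overhead percentages reported in the paper.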