Garbage collection: algorithms for automatic dynamic memory management
Computer architecture (2nd ed.): a quantitative approach
A High-Performance Memory Allocator for Object-Oriented Systems
IEEE Transactions on Computers
Dynamic Storage Allocation: A Survey and Critical Review
IWMM '95 Proceedings of the International Workshop on Memory Management
DMMX: dynamic memory management extensions
Journal of Systems and Software
Active Memory Processor: A Hardware Garbage Collector for Real-Time Java Embedded Devices
IEEE Transactions on Mobile Computing
The design and analysis of a quantitative simulator for dynamic memory management
Journal of Systems and Software
A self-maintained memory module supporting DMM
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Memory management thread for heap allocation intensive sequential applications
Proceedings of the 10th workshop on MEmory performance: DEaling with Applications, systems and architecture
Recent advances in software engineering, such as graphical user interfaces and object-oriented programming, have made applications more memory intensive. These applications tend to allocate dynamic memory prolifically. Moreover, automatic dynamic memory reclamation (garbage collection, GC) has become a popular feature of modern programming languages. As a result, dynamic storage management can consume up to one-third of program execution time. This illustrates the need for a high-performance memory management scheme.

This paper presents a top-level design and evaluation of proposed instruction extensions that facilitate heap management. The instructions are h_malloc for memory allocation, and mark and sweep for garbage collection. Simulation results show that the hit ratios for 2-Kbit and 8-Kbit buffers range from 84-99% and 95-99%, respectively. The hardware complexity of the proposed scheme is O(n), where n is the size of the bit-vector. For a design with 20K gates and 97% miss rate, the overall speedup can be as high as 1.41.
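As a rough software analogy for the three operations named above, the following sketch tracks a heap of fixed-size blocks with a bit-vector. It is an illustration under stated assumptions, not the hardware design evaluated in the paper: only the names h_malloc, mark, and sweep come from the abstract, while the class BitVectorHeap, the first-fit search, the per-object reference lists, and the block granularity are all hypothetical. The linear scan here also says nothing about the O(n) hardware complexity, which the abstract states for the hardware itself.

```python
class BitVectorHeap:
    """Toy heap of fixed-size blocks tracked by bit-vectors (illustrative only)."""

    def __init__(self, n_blocks):
        self.alloc_bits = [0] * n_blocks  # 1 = block is in use
        self.mark_bits = [0] * n_blocks   # 1 = object starting here was reached
        self.sizes = {}                   # start block -> object length in blocks
        self.refs = {}                    # start block -> start blocks it points to

    def h_malloc(self, n_blocks):
        """First-fit search for n_blocks consecutive free bits (assumed policy)."""
        if n_blocks <= 0:
            raise ValueError("n_blocks must be positive")
        run = 0
        for i, bit in enumerate(self.alloc_bits):
            run = 0 if bit else run + 1
            if run == n_blocks:
                start = i - n_blocks + 1
                for j in range(start, i + 1):
                    self.alloc_bits[j] = 1
                self.sizes[start] = n_blocks
                self.refs[start] = []
                return start
        return None  # no free run is long enough

    def mark(self, roots):
        """Set the mark bit of every object reachable from the roots."""
        stack = list(roots)
        while stack:
            obj = stack.pop()
            if self.mark_bits[obj]:
                continue
            self.mark_bits[obj] = 1
            stack.extend(self.refs[obj])

    def sweep(self):
        """Free every unmarked object, clear all mark bits, return blocks freed."""
        freed = 0
        for start in list(self.sizes):
            if not self.mark_bits[start]:
                size = self.sizes.pop(start)
                del self.refs[start]
                for j in range(start, start + size):
                    self.alloc_bits[j] = 0
                freed += size
            self.mark_bits[start] = 0
        return freed


heap = BitVectorHeap(8)
a = heap.h_malloc(2)
b = heap.h_malloc(3)      # never referenced, so it becomes garbage
c = heap.h_malloc(1)
heap.refs[a].append(c)    # a points to c
heap.mark([a])            # a is the only root
print(heap.sweep(), heap.alloc_bits)
```

In this usage, b is unreachable from the root a, so the sweep clears its three bits and a later three-block request can reuse that run.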