A Cost-Effective Main Memory Organization for Future Servers

Authors:
Magnus Ekman;Per Stenstrom
Affiliations:
Chalmers University of Technology, Sweden;Chalmers University of Technology, Sweden
Venue:
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Year:
2005

Abstract

Today, the amount of main memory in mid-range servers is pushing practical limits, with as much as 192 GB of memory in a 24-processor system. Further, with the onset of multi-threaded, multi-core processor chips, the number of memory chips per processor chip is likely to increase, making DRAM cost and size an even larger burden. In this paper, we investigate an alternative main memory organization, a two-level noninclusive memory hierarchy in which the second level is substantially slower than the first, with the aim of reducing the total system cost and spatial requirements of current and future servers. We quantitatively investigate how big and how slow the second level can be. Surprisingly, we find that only 30% of the memory resources typically needed must be accessed at DRAM speed, whereas the rest can be accessed at a speed that is an order of magnitude slower with a negligible performance impact (1.2% on average). We also present a cost-effective implementation for managing such a hierarchy and show how it can bring down memory cost by leveraging memory compression and sharing of memory resources among servers.
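The trade-off between the speed of the second level and how often it is accessed can be illustrated with a simple average-latency calculation. The sketch below is not the authors' evaluation method. It is a minimal first-order model, and the function names, the 100 ns fast-level latency, and the 1% slow-access fraction are all hypothetical values chosen for illustration. It also models mean memory latency only, which is not the same quantity as the overall performance impact reported in the abstract.

```python
def average_latency(fast_latency_ns, slowdown, slow_fraction):
    """Mean access latency of a two-level memory (illustrative model only).

    fast_latency_ns: latency of the first (DRAM-speed) level; assumed value.
    slowdown:        how many times slower the second level is (e.g. 10).
    slow_fraction:   fraction of accesses served by the second level; assumed.
    """
    if not 0.0 <= slow_fraction <= 1.0:
        raise ValueError("slow_fraction must be between 0 and 1")
    slow_latency_ns = fast_latency_ns * slowdown
    return (1.0 - slow_fraction) * fast_latency_ns + slow_fraction * slow_latency_ns


def max_slow_fraction(slowdown, latency_budget):
    """Largest fraction of slow accesses that keeps the mean latency within
    (1 + latency_budget) times the fast-level latency.

    Derived from 1 + m * (slowdown - 1) <= 1 + latency_budget.
    """
    if slowdown <= 1.0:
        raise ValueError("slowdown must be greater than 1")
    return latency_budget / (slowdown - 1.0)


if __name__ == "__main__":
    # Hypothetical numbers: 100 ns first level, second level 10x slower,
    # 1% of accesses going to the second level.
    print(average_latency(100.0, 10.0, 0.01))
    print(max_slow_fraction(10.0, 0.09))
```

Under this model, a second level that is an order of magnitude slower adds little to the mean latency as long as only a small fraction of accesses reach it, which is the intuition behind keeping the frequently used portion of memory at DRAM speed.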