IEEE Micro
PicoServer: using 3D stacking technology to enable a compact energy efficient chip multiprocessor
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
FlashCache: a NAND flash memory file cache for low power web servers
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
JouleSort: a balanced energy-efficiency benchmark
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Energy efficient near-threshold chip multi-processing
ISLPED '07 Proceedings of the 2007 international symposium on Low power electronics and design
Understanding and Designing New Server Architectures for Emerging Warehouse-Computing Environments
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Improving NAND Flash Based Disk Caches
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Large-Scale Parallel Collaborative Filtering for the Netflix Prize
AAIM '08 Proceedings of the 4th international conference on Algorithmic Aspects in Information and Management
Architecting phase change memory as a scalable dram alternative
Proceedings of the 36th annual international symposium on Computer architecture
A durable and energy efficient main memory using phase change memory technology
Proceedings of the 36th annual international symposium on Computer architecture
Scalable high performance main memory system using phase-change memory technology
Proceedings of the 36th annual international symposium on Computer architecture
FAWN: a fast array of wimpy nodes
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Better I/O through byte-addressable, persistent memory
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
PACT '09 Proceedings of the 2009 18th International Conference on Parallel Architectures and Compilation Techniques
Enhancing lifetime and security of PCM-based main memory with start-gap wear leveling
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Formalizing MapReduce with CSP
ECBS '10 Proceedings of the 2010 17th IEEE International Conference and Workshops on the Engineering of Computer-Based Systems
Energy proportional datacenter networks
Proceedings of the 37th annual international symposium on Computer architecture
Mnemosyne: lightweight persistent memory
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
NV-Heaps: making persistent objects fast and safe with next-generation, non-volatile memories
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Consistent and durable data structures for non-volatile byte-addressable memory
FAST'11 Proceedings of the 9th USENIX conference on File and stroage technologies
Communications of the ACM
Integrated 3D-stacked server designs for increasing physical density of key-value stores
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
NVM duet: unified working memory and persistent store architecture
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
The adoption of non-volatile memories (NVMs) in system architecture and the growth in data-centric workloads offer exciting opportunities for new designs. In this paper, we examine the potential and limit of designs that move compute in close proximity to NVM-based data stores. To address the challenges in evaluating such system architectures for distributed systems, we develop and validate a new methodology for large-scale data-centric workloads. We then study "nanostores" as an example design that constructs distributed systems from building blocks with 3D-stacked compute and NVM layers on the same chip, replacing both traditional storage and memory with NVM. Our limits study demonstrates significant potential of this approach (3-162X improvement in energy delay product) over 2015 baselines, particularly for IO-intensive workloads. We also discuss and quantify the impact of network bandwidth, software scalability, and power density, and design tradeoffs for future NVM-based data-centric architectures.