Today's databases and key-value stores commonly keep all their data in main memory. A single server can have over 100 GB of memory, and a cluster of such servers can have tens to hundreds of terabytes. However, a storage back end is still required for recovery from failures. Recovery can take minutes for a single server or hours for a whole cluster, and it places a heavy load on the back end. Non-volatile main memory (NVRAM) technologies can help by allowing near-instantaneous recovery of in-memory state. However, today's software does not support this well. Block-based approaches such as persistent buffer caches suffer from data duplication and block transfer overheads. Recently, user-level persistent heaps have been shown to perform much better than block-based approaches. However, they require substantial application modification and still have significant runtime overheads. This paper proposes whole-system persistence (WSP) as an alternative. WSP is aimed at systems where all memory is non-volatile. It transparently recovers an application's entire state, making a failure appear as a suspend/resume event. Runtime overheads are eliminated by using "flush on fail": transient state in processor registers and caches is flushed to NVRAM only on failure, using the residual energy from the system power supply. Our evaluation shows that this approach has 1.6 to 13 times better runtime performance than a persistent heap, and that flush-on-fail can complete safely within 2 to 35% of the residual energy window provided by standard power supplies.
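The flush-on-fail condition described above amounts to a simple budget check: the time needed to write transient state back to NVRAM must fit inside the window during which the power supply still delivers usable energy. The following is a minimal sketch of that arithmetic, not the paper's method. The function names, the linear cost model (dirty bytes divided by write-back bandwidth, plus a fixed register-save cost), and every numeric value are hypothetical placeholders chosen for illustration; they are not measurements from the evaluation.

```python
"""Back-of-the-envelope budget check for flush-on-fail.

Assumption: flush time is modelled as dirty cache bytes divided by a
sustained write-back bandwidth, plus a fixed cost for saving registers.
All names and numbers here are hypothetical, not taken from the paper.
"""


def flush_time_ms(dirty_bytes, writeback_bytes_per_s, register_save_ms=0.0):
    """Estimated time (ms) to flush transient state to NVRAM."""
    if writeback_bytes_per_s <= 0:
        raise ValueError("write-back bandwidth must be positive")
    return dirty_bytes / writeback_bytes_per_s * 1000.0 + register_save_ms


def window_fraction(flush_ms, residual_window_ms):
    """Fraction of the residual energy window consumed by the flush."""
    if residual_window_ms <= 0:
        raise ValueError("residual window must be positive")
    return flush_ms / residual_window_ms


def flush_is_safe(flush_ms, residual_window_ms):
    """True if the flush finishes before residual energy runs out."""
    return window_fraction(flush_ms, residual_window_ms) <= 1.0


if __name__ == "__main__":
    # Hypothetical inputs: 8 MiB of dirty cache lines, 4 GiB/s write-back
    # bandwidth, and a 10 ms residual energy window.
    t = flush_time_ms(8 * 2**20, 4 * 2**30)
    frac = window_fraction(t, 10.0)
    print(f"flush takes {t:.3f} ms, {frac:.1%} of the window")
    print("safe" if flush_is_safe(t, 10.0) else "unsafe")
```

A real assessment would replace the placeholder inputs with measured cache sizes, measured flush latency, and the measured hold-up behaviour of the specific power supply.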