Automated application-level checkpointing of MPI programs
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Portable Checkpointing for Heterogeneous Archtitectures
FTCS '97 Proceedings of the 27th International Symposium on Fault-Tolerant Computing (FTCS '97)
Application-level checkpointing for shared memory programs
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Libckpt: transparent checkpointing under Unix
TCON'95 Proceedings of the USENIX 1995 Technical Conference Proceedings
The end of an architectural era: (it's time for a complete rewrite)
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
The case for RAMClouds: scalable high-performance storage entirely in DRAM
ACM SIGOPS Operating Systems Review
Behavioral simulations in MapReduce
Proceedings of the VLDB Endowment
Fast checkpoint recovery algorithms for frequently consistent applications
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Hi-index | 0.00 |
In this demonstration we present BRRL, a library for making distributed main-memory applications fault tolerant. BRRL is optimized for cloud applications with frequent points of consistency that use data-parallelism to avoid complex concurrency control mechanisms. BRRL differs from existing recovery libraries by providing a simple table abstraction and using schema information to optimize checkpointing. We will demonstrate the utility of BRRL using a distributed transaction processing system and a platform for scientific behavioral simulations.