This paper presents Exalt, a library that restores researchers' ability to test the scalability of today's large storage systems. To that end, we introduce Tardis, a data representation scheme that allows data to be identified and efficiently compressed even at low-level storage layers that are unaware of the semantics and formatting used by higher levels of the system. This compression enables a high degree of node colocation, which makes it possible to run large-scale experiments on as few as a hundred machines. Our experience with HDFS and HBase shows that, by allowing us to run the real system code at an unprecedented scale, Exalt can help identify scalability problems that are not observable at lower scales: in particular, Exalt helped us pinpoint and resolve issues in HDFS, and fixing them improved its aggregate throughput by an order of magnitude.