Janus is a system for partitioning the flash storage tier between workloads in a cloud-scale distributed file system with two tiers, flash storage and disk. The file system stores newly created files in the flash tier and moves them to the disk tier using either a First-In-First-Out (FIFO) policy or a Least-Recently-Used (LRU) policy, subject to per-workload allocations. Because of the large scale of the system, Janus constructs compact metrics of the cacheability of the different workloads from sampled distributed traces. From these metrics, we formulate and solve an optimization problem to determine the flash allocation to workloads that maximizes the total reads sent to the flash tier, subject to operator-set priorities and bounds on flash write rates. Using measurements from production workloads in multiple data centers that follow these recommendations, as well as traces of other production workloads, we show that the resulting allocation improves the flash hit rate by 47-76% compared to a unified tier shared by all workloads. Based on these results and an analysis of several thousand production workloads, we conclude that flash storage is a cost-effective complement to disks in data centers.
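The allocation step described above can be illustrated with a small sketch. It is not the formulation used by Janus: it assumes each workload's cacheability is summarized as a concave piecewise-linear curve (a hypothetical list of segments, each giving a flash size and the reads gained per unit of flash), it enforces only a total flash capacity, and it leaves out operator-set priorities and write-rate bounds. Under those assumptions, handing out flash greedily by marginal read gain maximizes the total reads served from flash. The names `allocate_flash` and `flash_reads` are illustrative only.

```python
import heapq


def allocate_flash(curves, capacity):
    """Greedy flash allocation over concave piecewise-linear curves.

    curves: dict mapping workload name -> list of (segment_size, reads_per_unit)
            tuples, with reads_per_unit non-increasing along the list
            (hypothetical representation, assumed concave).
    capacity: total flash available, in the same units as segment_size.
    Returns a dict mapping workload name -> allocated flash.
    """
    alloc = {w: 0.0 for w in curves}
    heap = []  # max-heap on marginal gain, stored as negated values
    for w, segs in curves.items():
        if segs:
            heapq.heappush(heap, (-segs[0][1], w, 0))

    remaining = capacity
    while heap and remaining > 0:
        neg_gain, w, i = heapq.heappop(heap)
        size, gain = curves[w][i]
        if gain <= 0:
            break  # no remaining segment adds any flash reads
        take = min(size, remaining)
        alloc[w] += take
        remaining -= take
        # Only move to the next segment once this one is fully used.
        if take == size and i + 1 < len(curves[w]):
            heapq.heappush(heap, (-curves[w][i + 1][1], w, i + 1))
    return alloc


def flash_reads(curves, alloc):
    """Total reads served from flash under a given allocation."""
    total = 0.0
    for w, segs in curves.items():
        left = alloc.get(w, 0.0)
        for size, gain in segs:
            used = min(size, left)
            total += used * gain
            left -= used
            if left <= 0:
                break
    return total


if __name__ == "__main__":
    # Made-up curves for two workloads, purely for illustration.
    example = {
        "a": [(10, 5.0), (10, 1.0)],
        "b": [(10, 3.0), (10, 2.0)],
    }
    result = allocate_flash(example, capacity=25)
    print(result, flash_reads(example, result))
```

With the made-up curves in the example, workload "a" receives only its first, high-gain segment, while "b" receives the rest of the capacity because its second segment still yields more reads per unit of flash than the second segment of "a". Extra constraints such as per-workload write-rate bounds would turn this into a general linear program, which the greedy loop here does not handle.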