Efficient namespace metadata management is increasingly important as next-generation file systems are designed for peta- and exascale. New schemes have been proposed; however, their evaluation has been insufficient due to a lack of appropriate namespace metadata traces. Specifically, no Big Data storage system metadata trace is publicly available, and existing traces are a poor replacement. We studied publicly available traces and one Big Data trace from Yahoo!, and we note some of the differences and their implications for metadata management studies. We discuss the insufficiency of existing evaluation approaches and present a first step towards a statistical metadata workload model that captures the relevant characteristics of a workload and is suitable for synthetic workload generation. We describe Mimesis, a synthetic workload generator, and evaluate its usefulness through a case study of a least recently used (LRU) metadata cache for the Hadoop Distributed File System. Simulation results show that the traces generated by Mimesis mimic the original workload and can be used in place of the real trace while still providing accurate results.
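The kind of trace-driven evaluation described above can be illustrated with a minimal sketch. It is not the Mimesis implementation or the paper's simulator: the function names `lru_hit_ratio` and `synthetic_trace` are hypothetical, and the Zipf-like popularity model is only an assumed stand-in for a real or generated metadata trace. The idea is to replay two traces through the same LRU cache and compare hit ratios across cache sizes.

```python
from collections import OrderedDict
import random


def lru_hit_ratio(trace, capacity):
    """Replay a sequence of metadata keys (e.g. file paths) through an LRU cache
    and return the fraction of accesses that hit."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)          # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict the least recently used entry
    return hits / len(trace) if trace else 0.0


def synthetic_trace(num_files, length, skew=1.0, seed=0):
    """Hypothetical stand-in for a trace: file ids drawn with Zipf-like
    popularity (an assumption, not the workload model from the paper)."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** skew) for rank in range(1, num_files + 1)]
    return rng.choices(range(num_files), weights=weights, k=length)


if __name__ == "__main__":
    # Two traces drawn from the same assumed model play the roles of the
    # "original" and the "generated" workload.
    original = synthetic_trace(1000, 20000, seed=1)
    generated = synthetic_trace(1000, 20000, seed=2)
    for capacity in (10, 100, 500):
        print(capacity,
              round(lru_hit_ratio(original, capacity), 3),
              round(lru_hit_ratio(generated, capacity), 3))
```

If a generated trace is a good substitute for the original, the two hit-ratio columns should stay close at every cache size; a large gap would suggest that the model misses some property of the workload, such as temporal locality.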