While file system metadata is well characterized by a variety of workload studies, scientific metadata is much less well understood. We characterize scientific metadata in order to better understand the implications for index design. Our findings indicate that existing solutions for either file system search or scientific search will not suffice for indexing a large scientific file system. We describe the problems with existing solutions and suggest column stores as an alternative approach.