The scale of today's storage systems has made it increasingly difficult to find and manage files. To address this, we have developed Spyglass, a file metadata search system designed specifically for large-scale storage systems. Using an optimized design, guided by an analysis of real-world metadata traces and a user study, Spyglass allows fast, complex searches over file metadata to help users and administrators better understand and manage their files. Spyglass achieves fast, scalable performance through several novel techniques that exploit properties of metadata search. An index partitioning mechanism that leverages namespace locality provides flexible index control. Signature files significantly reduce a query's search space, improving performance and scalability. Snapshot-based metadata collection allows incremental crawling of only modified files. A novel index versioning mechanism provides both fast index updates and "back-in-time" search of metadata. An evaluation of our Spyglass prototype using our real-world, large-scale metadata traces shows search performance that is 1-4 orders of magnitude faster than existing solutions. The Spyglass index can be updated quickly and typically requires less than 0.1% of disk space. Additionally, metadata collection is up to 10× faster than existing approaches.
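To illustrate how namespace partitioning and signature files can work together to shrink a query's search space, here is a minimal Python sketch. It is not the Spyglass implementation: the class names, the 64-bit signature width, the single SHA-1-derived hash, the two indexed attributes (`owner`, `ext`), and the partitioning depth are all assumptions chosen for illustration. Each partition covers one directory subtree and keeps a small bit mask per attribute; a query skips any partition whose mask shows the requested value cannot be present.

```python
import hashlib
from collections import defaultdict

SIG_BITS = 64                      # hypothetical signature width
SIGNED_ATTRS = ("owner", "ext")    # hypothetical attributes that get a signature


def sig_bit(value: str) -> int:
    """Map a metadata value to one bit position in a signature."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % SIG_BITS


class Partition:
    """Metadata records for one namespace subtree, plus per-attribute signatures."""

    def __init__(self):
        self.records = []                    # one dict per file
        self.signatures = defaultdict(int)   # attribute -> bit mask

    def add(self, record: dict) -> None:
        self.records.append(record)
        for attr in SIGNED_ATTRS:
            self.signatures[attr] |= 1 << sig_bit(record[attr])

    def may_contain(self, attr: str, value: str) -> bool:
        # False means "definitely absent"; True may be a false positive.
        if attr not in SIGNED_ATTRS:
            return True                      # no signature, so cannot prune
        return bool((self.signatures[attr] >> sig_bit(value)) & 1)


class PartitionedIndex:
    """Groups files into partitions by the leading components of their directory."""

    def __init__(self, depth: int = 2):
        self.depth = depth
        self.partitions = defaultdict(Partition)

    def partition_key(self, path: str) -> str:
        dirs = path.strip("/").split("/")[:-1]   # drop the file name
        return "/" + "/".join(dirs[: self.depth])

    def add(self, record: dict) -> None:
        self.partitions[self.partition_key(record["path"])].add(record)

    def search(self, **criteria):
        """Return (sorted matching paths, number of partitions actually scanned)."""
        hits, scanned = [], 0
        for part in self.partitions.values():
            # Signature check: skip partitions that cannot hold a match.
            if not all(part.may_contain(a, v) for a, v in criteria.items()):
                continue
            scanned += 1
            hits.extend(
                r["path"] for r in part.records
                if all(r.get(a) == v for a, v in criteria.items())
            )
        return sorted(hits), scanned


index = PartitionedIndex(depth=2)
for path, owner, ext in [
    ("/home/alice/src/main.c", "alice", "c"),
    ("/home/alice/notes.txt", "alice", "txt"),
    ("/home/bob/report.pdf", "bob", "pdf"),
    ("/proj/sim/run1/out.dat", "carol", "dat"),
]:
    index.add({"path": path, "owner": owner, "ext": ext})

paths, scanned = index.search(owner="alice", ext="c")
print(paths, "partitions scanned:", scanned, "of", len(index.partitions))
```

Because a signature can yield false positives but never false negatives, pruning affects only how many partitions are scanned, not which results are returned. A wider signature or additional hash functions would lower the false-positive rate at the cost of more space per partition.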