This paper describes Haystack, an object storage system optimized for Facebook's Photos application. Facebook currently stores over 260 billion images, which translates to over 20 petabytes of data. Users upload one billion new photos (∼60 terabytes) each week, and Facebook serves over one million images per second at peak. Haystack provides a less expensive and higher-performing solution than our previous approach, which leveraged network-attached storage appliances over NFS. Our key observation is that this traditional design incurs an excessive number of disk operations because of metadata lookups. We carefully reduce this per-photo metadata so that Haystack storage machines can perform all metadata lookups in main memory. This choice conserves disk operations for reading actual data and thus increases overall throughput.
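The core idea, keeping per-photo metadata small enough to live in main memory so that disk operations are spent only on photo data, can be illustrated with a minimal sketch. This is a toy model under stated assumptions, not Haystack's actual on-disk format or API: the class name `InMemoryIndexedStore`, its methods, and the single append-only file are hypothetical choices made only for illustration.

```python
import os
import tempfile


class InMemoryIndexedStore:
    """Toy append-only blob store (hypothetical; not Haystack's real design).

    All blobs live in one large file. The only per-photo metadata is an
    (offset, size) pair held in a Python dict, so a lookup never touches disk.
    """

    def __init__(self, path):
        # "a+b": writes always append to the end; reads may seek anywhere.
        self._file = open(path, "a+b")
        self._index = {}  # photo_id -> (offset, size), kept entirely in memory

    def put(self, photo_id, data):
        # Record where the blob starts, then append it to the file.
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(data)
        self._file.flush()
        self._index[photo_id] = (offset, len(data))

    def get(self, photo_id):
        # Metadata lookup is a pure in-memory dict access.
        offset, size = self._index[photo_id]
        # The only file access is the read of the photo bytes themselves.
        self._file.seek(offset)
        return self._file.read(size)

    def close(self):
        self._file.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = InMemoryIndexedStore(os.path.join(tmp, "volume.dat"))
        store.put(101, b"first-photo-bytes")
        store.put(102, b"second-photo-bytes")
        print(store.get(101))
        store.close()
```

By contrast, a design that stores each photo as its own file on a general-purpose file system may need additional disk operations to resolve directory and inode metadata before the data read, which is the overhead the abstract identifies.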