Parity logging with reserved space: towards efficient updates and recovery in erasure-coded clustered storage

Authors:
Jeremy C. W. Chan;Qian Ding;Patrick P. C. Lee;Helen H. W. Chan
Affiliations:
The Chinese University of Hong Kong;The Chinese University of Hong Kong;The Chinese University of Hong Kong;The Chinese University of Hong Kong
Venue:
FAST'14 Proceedings of the 12th USENIX conference on File and Storage Technologies
Year:
2014

Abstract

Many modern storage systems adopt erasure coding to provide data availability guarantees with low redundancy. To make updates efficient, such systems often use log-based storage, which appends new data instead of overwriting existing data; recovery, however, then incurs significant I/O overhead because updates must be reassembled from data and parity chunks. We propose parity logging with reserved space, which has two key design features: (1) it combines in-place data updates with log-based parity updates to balance the costs of updates and recovery, and (2) it keeps parity updates in a reserved space next to the parity chunk to mitigate disk seeks. We further propose a workload-aware scheme that dynamically predicts and adjusts the size of the reserved space. We prototype an erasure-coded clustered storage system called CodFS and conduct testbed experiments on different update schemes under synthetic and real-world workloads. We show that our proposed update scheme achieves both high update performance and high recovery performance, a combination that neither pure in-place nor pure log-based update schemes can achieve.
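
The two design features described in the abstract can be illustrated with a minimal Python sketch. It is not the CodFS implementation: it assumes a single XOR parity chunk per stripe in place of a general erasure code, keeps everything in memory, and uses hypothetical names (`Stripe`, `ParityChunk`, `log_delta`, `merge`, `reserved_capacity`) that do not come from the paper. Data chunks are overwritten in place, while each parity delta (old data XOR new data) is appended to a bounded reserved area attached to the parity chunk and folded into the parity only when that area fills up or a recovery needs the current parity.

```python
from dataclasses import dataclass, field


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


@dataclass
class ParityChunk:
    base: bytes             # parity as of the last merge
    reserved_capacity: int  # assumed size (bytes) of the reserved space
    reserved: list = field(default_factory=list)  # logged (offset, delta) pairs
    used: int = 0           # bytes of reserved space currently occupied

    def log_delta(self, offset: int, delta: bytes) -> None:
        # Append the delta to the reserved space; merge first if it would not fit.
        # Assumption: a single delta never exceeds reserved_capacity.
        if self.used + len(delta) > self.reserved_capacity:
            self.merge()
        self.reserved.append((offset, delta))
        self.used += len(delta)

    def merge(self) -> None:
        # Fold all logged deltas into the parity and empty the reserved space.
        buf = bytearray(self.base)
        for offset, delta in self.reserved:
            for i, d in enumerate(delta):
                buf[offset + i] ^= d
        self.base = bytes(buf)
        self.reserved.clear()
        self.used = 0


class Stripe:
    def __init__(self, data_chunks, reserved_capacity: int):
        self.data = [bytearray(c) for c in data_chunks]
        parity = bytes(len(data_chunks[0]))
        for c in data_chunks:
            parity = xor_bytes(parity, c)
        self.parity = ParityChunk(parity, reserved_capacity)

    def update(self, chunk_idx: int, offset: int, new: bytes) -> None:
        end = offset + len(new)
        old = bytes(self.data[chunk_idx][offset:end])
        self.data[chunk_idx][offset:end] = new               # in-place data update
        self.parity.log_delta(offset, xor_bytes(old, new))   # logged parity update

    def recover(self, lost_idx: int) -> bytes:
        # Bring the parity up to date, then XOR it with the surviving data chunks.
        self.parity.merge()
        out = self.parity.base
        for i, c in enumerate(self.data):
            if i != lost_idx:
                out = xor_bytes(out, bytes(c))
        return out
```

In this sketch, `Stripe([b"aaaa", b"bbbb", b"cccc"], reserved_capacity=4)` followed by `update(1, 1, b"XY")` leaves one delta in the reserved area, and `recover(1)` rebuilds the updated chunk from the merged parity and the other two chunks. A fixed `reserved_capacity` stands in for the workload-aware sizing, whose prediction method the abstract does not specify.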