CRAID: online RAID upgrades using dynamic hot data reorganization

Authors:
A. Miranda; T. Cortes
Affiliations:
Barcelona Supercomputing Center; Barcelona Supercomputing Center and Technical University of Catalonia
Venue:
FAST'14 Proceedings of the 12th USENIX conference on File and Storage Technologies
Year:
2014


Abstract

Current algorithms used to upgrade RAID arrays typically require large amounts of data to be migrated, even those that move only the minimum amount of data required to keep a balanced data load. This paper presents CRAID, a self-optimizing RAID array that performs an online block reorganization of frequently used, long-term accessed data in order to reduce this migration even further. To achieve this objective, CRAID tracks frequently used, long-term data blocks and copies them to a dedicated partition spread across all the disks in the array. When new disks are added, CRAID only needs to extend this process to the new devices to redistribute this partition, thus greatly reducing the overhead of the upgrade process. In addition, the reorganized access patterns within this partition improve the array's performance, amortizing the copy overhead and allowing CRAID to offer performance competitive with traditional RAIDs. We describe CRAID's motivation and design, and we evaluate it by replaying seven real-world workloads, including a file server, a web server, and a user share. Our experiments show that CRAID can successfully detect hot data variations and begin using new disks as soon as they are added to the array. Also, the use of a dedicated partition improves the sequentiality of relevant data access, which amortizes the cost of reorganizations. Finally, we prove that a full-HDD CRAID array with a small distributed partition (