Automatic Recovery from Disk Failure in Continuous-Media Servers

Authors:
Jack Y. B. Lee; John C. S. Lui
Affiliations:
Chinese Univ. of Hong Kong, Shatin, Hong Kong; Chinese Univ. of Hong Kong, Shatin, Hong Kong
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2002

Abstract

Continuous-media (CM) servers have been in use for some years. Apart from server capacity, reliability is another important issue in their deployment. This study investigates algorithms for automatically rebuilding the data stored on a failed disk onto a spare disk. First, a block-based rebuild algorithm is studied, and its rebuild time and buffer requirement are modeled. A buffer-sharing scheme is then proposed to eliminate the additional buffers needed by the rebuild process. To further improve rebuild performance, a track-based rebuild algorithm, which rebuilds lost data in units of tracks, is proposed and analyzed. Results show that track-based rebuild substantially outperforms block-based rebuild but requires significantly more buffers (17-135 percent more), even with buffer sharing. To tackle this problem, a novel pipelined rebuild algorithm is proposed that exploits the sequential property of track retrievals to pipeline the reading and writing processes. This pipelined algorithm achieves the same rebuild performance as track-based rebuild while reducing the extra buffer requirement to insignificant levels (0.7-1.9 percent). Numerical results computed using models of five commercial disk drives demonstrate that a failed disk can be rebuilt automatically in a reasonable amount of time, even at relatively high server utilization (e.g., in less than 1.5 hours at 90 percent utilization).
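
The relationship between server utilization and rebuild time mentioned at the end of the abstract can be illustrated with a back-of-envelope sketch. This is not the paper's model: it assumes, purely for illustration, that the rebuild process may use only the fraction of a disk's sustained throughput left over by normal streaming. The function name, its parameters, and the drive figures in the example are all hypothetical.

```python
def rebuild_time_hours(capacity_gb: float,
                       throughput_mb_s: float,
                       utilization: float) -> float:
    """Estimate the hours needed to rebuild one disk from leftover bandwidth.

    Toy assumption (not taken from the paper): the rebuild gets exactly
    (1 - utilization) of the disk's sustained throughput, and seek and
    buffering overheads are ignored.
    """
    if capacity_gb <= 0 or throughput_mb_s <= 0:
        raise ValueError("capacity and throughput must be positive")
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")

    spare_mb_s = throughput_mb_s * (1.0 - utilization)  # bandwidth left for rebuild
    seconds = capacity_gb * 1000.0 / spare_mb_s          # 1 GB taken as 1000 MB
    return seconds / 3600.0


if __name__ == "__main__":
    # Hypothetical drive: 9 GB capacity, 20 MB/s sustained throughput.
    for u in (0.0, 0.5, 0.9):
        hours = rebuild_time_hours(9.0, 20.0, u)
        print(f"utilization {u:.0%}: about {hours:.2f} h")
```

Under this simplification, rebuild time grows as 1 / (1 - utilization), so how efficiently the rebuild uses the small amount of remaining disk time matters most at high utilization. The block-based, track-based, and pipelined algorithms in the paper are analyzed with detailed disk models that this sketch does not attempt to reproduce.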