A tradeoff analysis of delayed reconstruction for storage clusters

Authors:
Qiang Cao;Hongyan Li;Yan Yang;Makoto Takizawa;Naixue Xiong
Affiliations:
Huazhong University of Sci. and Tech., Wuhan, Hubei, China;Huazhong University of Sci. and Tech., Wuhan, Hubei, China and Hubei University of Economics, Wuhan, China;Western Illinois University, Macomb, IL;Seikei University, Tokyo, Japan;Georgia State University, Atlanta, Georgia
Venue:
Proceedings of the 6th International Wireless Communications and Mobile Computing Conference
Year:
2010

Abstract

A large share of node failures in a storage cluster do not actually destroy the data on disk, and some failed nodes recover quickly. A policy that defers reconstruction for a certain time after a node failure, in the hope that the node recovers, can therefore reduce unnecessary data rebuilding. Such a policy is feasible and attractive, but it also introduces a risk of data loss. In this paper, we present two delayed-reconstruction algorithms that differ in how the delay time is set: static and dynamic. A qualitative approach is proposed to analyze the reliability, risk, and benefit of the two methods. Numerical results show that, under a given distribution function of repair time for failed nodes, the static method has an optimal delay time that balances risk against benefit. Moreover, the dynamic method controls risk better than the static method, at the cost of a higher probability of launching reconstruction.
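
The tradeoff behind a static delay can be illustrated with a small sketch. This is not the paper's model: it assumes, purely for illustration, that repair time is exponentially distributed and that further failures arrive as a Poisson process. The function names and parameters (`mean_repair`, `failure_rate`, `delay`) are hypothetical. For each candidate delay it computes the probability that the failed node recovers before the delay expires (rebuild avoided) and the probability that another failure arrives while the cluster is still waiting (risk exposure).

```python
import math


def p_recover_within(delay, mean_repair):
    """P(repair time <= delay), assuming an exponential repair time.

    The exponential distribution is an assumption of this sketch,
    not something stated in the abstract.
    """
    return 1.0 - math.exp(-delay / mean_repair)


def p_failure_while_waiting(delay, mean_repair, failure_rate):
    """P(another failure arrives before the node recovers or the delay expires).

    Assumes further failures form a Poisson process with rate `failure_rate`
    (hypothetical parameter), independent of the repair time. The waiting
    window is min(repair_time, delay), which gives the closed form below.
    """
    repair_rate = 1.0 / mean_repair
    total_rate = repair_rate + failure_rate
    return failure_rate / total_rate * (1.0 - math.exp(-total_rate * delay))


def static_tradeoff(delays, mean_repair, failure_rate):
    """Return (delay, benefit, risk) tuples for each candidate static delay."""
    return [
        (
            d,
            p_recover_within(d, mean_repair),
            p_failure_while_waiting(d, mean_repair, failure_rate),
        )
        for d in delays
    ]


if __name__ == "__main__":
    # Illustrative numbers only (hours); they are not taken from the paper.
    for d, benefit, risk in static_tradeoff([0.5, 1, 2, 4, 8], 2.0, 0.01):
        print(f"delay={d:>4}h  P(rebuild avoided)={benefit:.3f}  P(risk)={risk:.4f}")
```

Under these assumptions both quantities grow with the delay, so any choice of delay trades fewer unnecessary rebuilds against longer exposure at reduced redundancy. How the two are weighted to find an optimum depends on the cost model, which the abstract does not specify.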