Warding off the dangers of data corruption with Amulet

Authors:
Nedyalko Borisov;Shivnath Babu;Nagapramod Mandagere;Sandeep Uttamchandani
Affiliations:
Duke University, Durham, NC, USA;Duke University, Durham, NC, USA;IBM Almaden Research, San Jose, CA, USA;IBM Almaden Research, San Jose, CA, USA
Venue:
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Year:
2011

Abstract

Occasional corruption of stored data is an unfortunate byproduct of the complexity of modern systems. Hardware errors, software bugs, and mistakes by human administrators can corrupt important sources of data. The dominant practice for dealing with data corruption today involves administrators writing ad hoc scripts that run data-integrity tests at the application, database, file-system, and storage levels. This manual approach is tedious, error-prone, and provides no understanding of the potential system unavailability and data loss if a corruption were to occur. We introduce the Amulet system that addresses the problem of verifying the correctness of stored data proactively and continuously. To our knowledge, Amulet is the first system that: (i) gives administrators a declarative language to specify their objectives regarding the detection and repair of data corruption; (ii) contains optimization and execution algorithms to ensure that the administrator's objectives are met robustly and at the least cost, e.g., using pay-as-you-go cloud resources; and (iii) provides timely notification when corruption is detected, allowing proactive repair of corruption before it impacts users and applications. We describe the implementation and a comprehensive evaluation of Amulet for a database software stack deployed on an infrastructure-as-a-service cloud provider.
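
For readers unfamiliar with the "ad hoc scripts" the abstract refers to, the following is a minimal sketch of one such file-level data-integrity test. It is not Amulet's declarative language or any of its algorithms, which this page does not show. It assumes a simple checksum-manifest approach, and every name in it (`sha256_of`, `build_manifest`, `find_corrupted`, `demo`) is hypothetical and chosen only for illustration.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record a known-good checksum per file (hypothetical baseline step)."""
    return {path: sha256_of(path) for path in paths}


def find_corrupted(manifest):
    """Return paths that are missing or whose checksum no longer matches."""
    corrupted = []
    for path, expected in manifest.items():
        if not os.path.exists(path) or sha256_of(path) != expected:
            corrupted.append(path)
    return sorted(corrupted)


def demo():
    """Create two files, overwrite one byte in the second, report the damage."""
    with tempfile.TemporaryDirectory() as workdir:
        good = os.path.join(workdir, "good.dat")
        bad = os.path.join(workdir, "bad.dat")
        for path in (good, bad):
            with open(path, "wb") as handle:
                handle.write(b"stored data block" * 64)
        manifest = build_manifest([good, bad])
        # Simulate silent corruption by overwriting a single byte in place.
        with open(bad, "r+b") as handle:
            handle.seek(10)
            handle.write(b"\x00")
        return [os.path.basename(p) for p in find_corrupted(manifest)]


if __name__ == "__main__":
    print(demo())
```

A script like this only detects corruption when someone remembers to run it, and it says nothing about how quickly corruption would be noticed or what a repair would cost. Those are the gaps the abstract says Amulet is designed to close.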