An integrated approach to recovery and high availability in an updatable, distributed data warehouse

Authors:
Edmond Lau;Samuel Madden
Affiliations:
MIT CSAIL;MIT CSAIL
Venue:
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Year:
2006

Abstract

Any highly available data warehouse will use some form of data replication to tolerate machine failures. In this paper, we demonstrate that we can leverage this data redundancy to build an integrated approach to recovery and high availability. Our approach, called HARBOR, revives a crashed site by querying remote, online sites for missing updates and uses timestamps to determine which tuples need to be copied or updated. HARBOR does not require a stable log, recovers without quiescing the system, allows replicated data to be stored non-identically, and is simpler than a log-based recovery algorithm.

We compare the runtime overhead and recovery performance of HARBOR to those of two-phase commit and ARIES, the gold standard for log-based recovery, on a three-node distributed database system. Our experiments demonstrate that HARBOR suffers lower runtime overhead, has recovery performance comparable to ARIES's, and can tolerate the fault of a worker and efficiently bring it back online.
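
The recovery idea in the abstract (ask an online replica for updates the crashed site missed, using timestamps to pick the tuples to copy or update) can be illustrated with a toy sketch. This is not the HARBOR algorithm itself: the `Row` and `Site` classes, the single `updated_at` logical timestamp per tuple, and the `checkpoint_ts` parameter are all hypothetical names introduced here, and the sketch assumes the crashed site's local state is complete up to `checkpoint_ts`. It ignores deletions, concurrency, and non-identical storage layouts.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Row:
    key: int
    value: str
    updated_at: int  # hypothetical logical timestamp of the last change


@dataclass
class Site:
    rows: Dict[int, Row] = field(default_factory=dict)

    def apply(self, key: int, value: str, ts: int) -> None:
        """Insert or overwrite a tuple, stamping it with timestamp `ts`."""
        self.rows[key] = Row(key, value, ts)

    def changed_since(self, ts: int) -> List[Row]:
        """Answer a recovery query: every tuple modified after `ts`."""
        return [r for r in self.rows.values() if r.updated_at > ts]


def recover(crashed: Site, replica: Site, checkpoint_ts: int) -> int:
    """Copy or update the tuples the crashed site missed.

    Assumes (hypothetically) that `crashed` already holds every change
    with a timestamp <= checkpoint_ts. Returns the number of tuples fetched.
    """
    missing = replica.changed_since(checkpoint_ts)
    for row in missing:
        crashed.rows[row.key] = Row(row.key, row.value, row.updated_at)
    return len(missing)


# Both sites see the same updates up to timestamp 2.
replica, crashed = Site(), Site()
for site in (replica, crashed):
    site.apply(1, "a", ts=1)
    site.apply(2, "b", ts=2)

# The crashed site is down while the replica keeps accepting updates.
replica.apply(2, "b2", ts=3)  # update to an existing tuple
replica.apply(3, "c", ts=4)   # newly inserted tuple

fetched = recover(crashed, replica, checkpoint_ts=2)
```

In this sketch the recovering site never reads a log; it only queries the replica's current state, filtered by timestamp, which is the property the abstract highlights.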