Joint evaluation of recovery and performance of a COTS DBMS in the presence of operator faults

Authors:
Marco Vieira; Henrique Madeira
Affiliations:
ISEC/CISUC, Polytechnic Institute of Coimbra, 3031 Coimbra, Portugal; DEI/CISUC, University of Coimbra, 3030 Coimbra, Portugal
Venue:
Performance Evaluation - Dependable systems and networks-performance and dependability symposium (DSN-PDS) 2002: Selected papers
Year:
2004

Abstract

Operator and administrator faults are a major cause of failures in large database management systems (DBMS). Although most complex DBMS available today have comprehensive recovery mechanisms, the effectiveness of these mechanisms is difficult to characterize. Moreover, tuning a large database is very complex, and database administrators tend to concentrate on performance tuning and disregard the recovery mechanisms. Above all, database administrators seldom get feedback on how good a given configuration is with respect to recovery. This paper proposes an experimental approach to characterize both the performance and the recoverability of DBMS. Our approach is presented through a concrete example: benchmarking the performance and recovery of an Oracle DBMS running the standard TPC-C benchmark, extended with two new elements, namely a faultload based on operator faults and measures related to recoverability. A classification of operator/administrator faults in DBMS is proposed. A set of tools has been designed and built to reproduce operator faults in an Oracle 8i DBMS, using exactly the same means that real database administrators use in the field. The experimental approach is generic (i.e., it can be applied to any DBMS) and fully automatic. The paper ends with a discussion of the results and proposes guidelines to help database administrators balance performance tuning against recovery tuning.