Fault tolerance has long been a feature of database systems, with transactions supporting the structuring of applications so that updating applications can continue in spite of machine failures. For read-only queries, the received wisdom has been that support for fault tolerance is too expensive to be worthwhile. Distributed query processing is coming to be seen as a promising way of implementing applications that combine structured data and analysis operations in dynamic distributed settings such as computational grids. Such a query may be long-running, and having to redo the whole query after a failure may cause problems (e.g., if the result may trigger business-critical or safety-critical activities). This work describes and evaluates a new scheme for adding fault tolerance to distributed query processing through a rollback-recovery mechanism. The high-level expression of user requests in a physical algebra offers opportunities for tuning the fault-tolerance provision so as to reduce its cost and to give better performance than generic fault-tolerance mechanisms employed at the lowest level of query processing. This paper outlines how the publicly available OGSA-DQP computational grid-based distributed query processing system can be modified to include support for fault tolerance, and presents a performance evaluation that includes measurements of the cost of both protocol overheads and rollback-recovery for a set of example distributed queries.