Evaluation of Join Strategies for Distributed Mediation

Authors:
Vanja Josifovski; Timour Katchaounov; Tore Risch
Venue:
ADBIS '01 Proceedings of the 5th East European Conference on Advances in Databases and Information Systems
Year:
2001

Abstract

Three join algorithms are evaluated in an environment with distributed main-memory based mediators and data sources. A streamed ship-out join ships bulks of tuples to a mediator near a data source, followed by post-processing in the client. An extended streamed semi-join additionally builds a main-memory hash index in the client mediator. A ship-in algorithm materializes and joins the data in the client mediator. The first two algorithms are suitable for sources that require parameters to execute a query, such as web search engines and computational software; the last is suitable otherwise. We compare the execution times for obtaining all tuples and the first N tuples, and analyze the percentage of time spent in each subsystem while varying the network communication speed, the bulk size, and the number of duplicates in the data. The choice of join algorithm leads to orders-of-magnitude performance differences across mediation environments.
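
The three strategies can be illustrated with a minimal single-process sketch. It is a simplification for illustration, not the paper's implementation: the function names, the `(id, key)` tuple layout, and the use of a plain Python callable `source_fn` to stand in for a parameterized remote source are all assumptions, and no real network shipping takes place.

```python
from collections import defaultdict
from itertools import islice


def bulks(rows, bulk_size):
    """Split an iterable into lists of at most bulk_size rows."""
    it = iter(rows)
    while chunk := list(islice(it, bulk_size)):
        yield chunk


def ship_out_join(outer, source_fn, bulk_size):
    """Ship whole tuples in bulks; the source is called once per tuple."""
    for bulk in bulks(outer, bulk_size):
        # Hypothetical "remote" side: evaluate the source for each shipped tuple.
        shipped_back = [(row, v) for row in bulk for v in source_fn(row[1])]
        # Client side: post-process the returned stream.
        for row, v in shipped_back:
            yield (*row, v)


def semi_join(outer, source_fn, bulk_size):
    """Ship only distinct join keys; rebuild full tuples via a client hash index."""
    for bulk in bulks(outer, bulk_size):
        index = defaultdict(list)  # join key -> client tuples in this bulk
        for row in bulk:
            index[row[1]].append(row)
        # Hypothetical "remote" side: one source call per distinct key.
        shipped_back = [(k, v) for k in index for v in source_fn(k)]
        # Client side: probe the hash index to restore the duplicates.
        for k, v in shipped_back:
            for row in index[k]:
                yield (*row, v)


def ship_in_join(outer, source_rows):
    """Materialize the source's (key, value) rows in the client and join there."""
    index = defaultdict(list)
    for k, v in source_rows:
        index[k].append(v)
    for row in outer:
        for v in index.get(row[1], []):
            yield (*row, v)
```

In this sketch the semi-join variant calls the source once per distinct key within a bulk, so its advantage over ship-out grows with the number of duplicate keys and with the bulk size. The ship-in variant needs the source to be enumerable without parameters, which matches the abstract's remark that it suits sources that do not require parameters.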