Efficiently Ordering Query Plans for Data Integration

Venue: ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Year: 2002

Citing 0
Cited 9

A Frequency-based Approach for Mining Coverage Statistics in Data Integration
ICDE '04 Proceedings of the 20th International Conference on Data Engineering

Challenges in selecting paths for navigational queries: trade-off of benefit of path versus cost of plan
Proceedings of the 7th International Workshop on the Web and Databases: colocated with ACM SIGMOD/PODS 2004

Schema mediation for large-scale semantic data sharing
The VLDB Journal — The International Journal on Very Large Data Bases

Effectively Mining and Using Coverage and Overlap Statistics for Data Integration
IEEE Transactions on Knowledge and Data Engineering

BibFinder/StatMiner: effectively mining and using coverage and overlap statistics in data integration
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29

Query Planning for Searching Inter-dependent Deep-Web Databases
SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management

A top-down approach for compressing data cubes under the simultaneous evaluation of multiple hierarchical range queries
Journal of Intelligent Information Systems

Source selection in large scale data contexts: an optimization approach
DEXA'10 Proceedings of the 21st international conference on Database and expert systems applications: Part I

Improving source selection in large scale mediation systems through combinatorial optimization techniques
Transactions on large-scale data- and knowledge-centered systems III

Abstract

The goal of a data integration system is to provide a uniform interface to a multitude of data sources. Given a user query formulated in this interface, the system translates it into a set of query plans. Each plan is a query formulated over the data sources, and specifies a way to access sources and combine data to answer the user query.

In practice, when the number of sources is large, a data integration system must generate and execute many query plans whose utilities vary significantly. Hence, it is crucial that the system finds the best plans efficiently and executes them first, to guarantee an acceptable time to the first answers and an acceptable quality of those answers. We describe efficient solutions to this problem. First, we formally define the problem of ordering query plans. Second, we identify several interesting structural properties of the problem and describe three ordering algorithms that exploit these properties. Finally, we describe experimental results that suggest guidance on which algorithms perform best under which conditions.
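
To make the plan-ordering problem concrete, here is a minimal Python sketch of one generic way to produce plans best-first without enumerating all of them. It is not one of the paper's three algorithms. Everything in it is an assumption for illustration: the name `ordered_plans`, the "bucket" layout (one list of candidate sources per query subgoal), and the utility model (the product of per-source scores, which makes utility monotone in each choice).

```python
import heapq
from math import prod


def ordered_plans(buckets):
    """Yield (utility, plan) pairs in non-increasing utility order.

    buckets: one list per query subgoal, each holding (source, score) pairs
    with non-negative scores. A plan picks one source per bucket. Utility is
    ASSUMED to be the product of the chosen scores; this is a hypothetical
    model for illustration, not the paper's definition.
    """
    if not buckets or any(not b for b in buckets):
        return
    # Sort each bucket so that index 0 is its highest-scoring source.
    ranked = [sorted(b, key=lambda s: s[1], reverse=True) for b in buckets]

    def utility(idx):
        return prod(ranked[i][j][1] for i, j in enumerate(idx))

    start = (0,) * len(ranked)          # best source in every bucket
    heap = [(-utility(start), start)]   # max-heap via negated utility
    seen = {start}
    while heap:
        neg_u, idx = heapq.heappop(heap)
        yield -neg_u, tuple(ranked[i][j][0] for i, j in enumerate(idx))
        # Successors: move exactly one bucket to its next-best source.
        for i in range(len(idx)):
            if idx[i] + 1 < len(ranked[i]):
                nxt = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-utility(nxt), nxt))
```

Because the assumed utility can only drop when a bucket moves to a lower-ranked source, the heap always holds the next-best unseen plan, so a caller can stop after the first few plans and pay only for those. A utility that is not monotone in this sense (for example, one that accounts for overlap between sources) would break that guarantee and need a different strategy.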