Adapting to source properties in processing data integration queries

Authors:
Zachary G. Ives;Alon Y. Halevy;Daniel S. Weld
Affiliations:
University of Pennsylvania;University of Washington;University of Washington
Venue:
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Year:
2004

Citing 24
Cited 37

Optimization of dynamic query evaluation plans

SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Adaptive selectivity estimation using query feedback

SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Online aggregation

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Efficient mid-query re-optimization of sub-optimal query execution plans

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Cost-based query scrambling for initial delays

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Ripple joins for online aggregation

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
An adaptive query execution system for data integration

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Query optimization in the presence of limited access patterns

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Eddies: continuously adaptive query processing

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Continuously adaptive continuous queries over streams

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Exploiting statistics on query expressions for optimization

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Fast incremental maintenance of approximate histograms

ACM Transactions on Database Systems (TODS)
Optimizing Queries Across Diverse Data Sources

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Hash Joins and Hash Teams in Microsoft SQL Server

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
A Parallel Processing Strategy for Evaluating Recursive Queries

VLDB '86 Proceedings of the 12th International Conference on Very Large Data Bases
Dynamic Pipeline Scheduling for Improving Interactive Query Performance

Proceedings of the 27th International Conference on Very Large Data Bases
LEO - DB2's LEarning Optimizer

Proceedings of the 27th International Conference on Very Large Data Bases
Including Group-By in Query Optimization

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Sampling-Based Estimation of the Number of Distinct Values of an Attribute

VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
An XML query engine for network-bound data

The VLDB Journal — The International Journal on Very Large Data Bases
Query processing and optimization in Oracle Rdb

The VLDB Journal — The International Journal on Very Large Data Bases
Answering queries using views: A survey

The VLDB Journal — The International Journal on Very Large Data Bases
Efficient query processing for data integration

Efficient query processing for data integration
Tuple routing strategies for distributed eddies

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29

Proactive re-optimization

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Content-based routing: different plans for different data

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Approximate quantiles and the order of the stream

Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Data integration: the teenage years

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Query optimization over web services

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Mobile join operators for restricted sources

Mobile Information Systems
Query suspend and resume

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Query processing over incomplete autonomous databases

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Request Window: an approach to improve throughput of RDBMS-based data integration system by utilizing data sharing across concurrent distributed queries

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Cardinality estimation for the optimization of queries on ontologies

ACM SIGMOD Record
Adaptive query processing

Foundations and Trends in Databases
Automaton in or out: run-time plan optimization for XML stream processing

SSPS '08 Proceedings of the 2nd international workshop on Scalable stream processing system
A monitoring service for large-scale dynamic query optimisation in a grid environment

International Journal of Web and Grid Services
Joining the results of heterogeneous search engines

Information Systems
A strategy to develop adaptive and interactive query brokers

IDEAS '08 Proceedings of the 2008 international symposium on Database engineering & applications
Optimization of multi-domain queries on the web

Proceedings of the VLDB Endowment
Time-completeness trade-offs in record linkage using adaptive query processing

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
GCIP: exploiting the generation and optimization of integration processes

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Adaptive workload allocation in query processing in autonomous heterogeneous environments

Distributed and Parallel Databases
Autonomic query parallelization using non-dedicated computers: an evaluation of adaptivity options

The VLDB Journal — The International Journal on Very Large Data Bases
Optimization and Execution of Complex Scientific Queries over Uncorrelated Experimental Data

SSDBM 2009 Proceedings of the 21st International Conference on Scientific and Statistical Database Management
Evolution of Query Optimization Methods: From Centralized Database Systems to Data Grid Systems

DEXA '09 Proceedings of the 20th International Conference on Database and Expert Systems Applications
Query processing over incomplete autonomous databases: query rewriting using learned data dependencies

The VLDB Journal — The International Journal on Very Large Data Bases
Adaptive join processing in pipelined plans

Proceedings of the 13th International Conference on Extending Database Technology
Extending PostgreSQL to support distributed/heterogeneous query processing

DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Dynamic query optimisation: towards decentralised methods

International Journal of Intelligent Information and Database Systems
Query performance evaluation of an architecture for fine-grained integration of heterogeneous grid data sources

Future Generation Computer Systems
Cluster-and-conquer: hierarchical multi-metric query processing in large-scale database federations

Proceedings of the Fourteenth International Database Engineering & Applications Symposium
Policy-based management and sharing of sensitive information among government agencies

MILCOM'06 Proceedings of the 2006 IEEE conference on Military communications
Linked data query processing strategies

ISWC'10 Proceedings of the 9th international semantic web conference on The semantic web - Volume Part I
A mobile relational algebra

Mobile Information Systems
Run-time adaptivity for search computing

Search computing
Adapting to changing resource performance in grid query processing

DMG 2005 Proceedings of the First VLDB conference on Data Management in Grids
Progressive query optimization for federated queries

EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Chapter 10: join methods and query optimization

Search Computing
Utility-driven adaptive query workload execution

Future Generation Computer Systems
Efficient optimization and processing for distributed monitoring and control applications

PhD '12 Proceedings of the on SIGMOD/PODS 2012 PhD Symposium

Abstract

An effective query optimizer finds a query plan that exploits the characteristics of the source data. In data integration, little is known in advance about sources' properties, which necessitates the use of adaptive query processing techniques to adjust query processing on-the-fly. Prior work in adaptive query processing has focused on compensating for delays and adjusting for mis-estimated cardinality or selectivity values. In this paper, we present a generalized architecture for adaptive query processing and introduce a new technique, called adaptive data partitioning (ADP), which is based on the idea of dividing the source data into regions, each executed by different, complementary plans. We show how this model can be applied in novel ways to not only correct for underestimated selectivity and cardinality values, but also to discover and exploit order in the source data, and to detect and exploit source data that can be effectively pre-aggregated. We experimentally compare a number of alternative strategies and show that our approach is effective.
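
The idea of dividing the source data into regions, each executed by a different plan, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the positional split points, and the choice of two hash-join variants as the "complementary plans" are all assumptions made for illustration. It relies only on the fact that a join distributes over a union of disjoint partitions, so joining each region separately and then joining across regions yields the same result as one join over all the data.

```python
from collections import defaultdict


def join_build_right(left, right):
    """Hypothetical plan A: hash the right input, probe with the left (join on column 0)."""
    table = defaultdict(list)
    for s in right:
        table[s[0]].append(s)
    return [r + s for r in left for s in table.get(r[0], ())]


def join_build_left(left, right):
    """Hypothetical plan B: hash the left input, probe with the right.

    Output tuples keep the same column order as plan A.
    """
    table = defaultdict(list)
    for r in left:
        table[r[0]].append(r)
    return [r + s for s in right for r in table.get(s[0], ())]


def partitioned_join(r_rows, s_rows, r_split, s_split):
    """Join R and S after splitting each input into two regions by position.

    Region 1 is processed with plan A and region 2 with plan B. The two
    cross-region joins at the end recover the pairs that neither
    per-region plan could see.
    """
    r1, r2 = r_rows[:r_split], r_rows[r_split:]
    s1, s2 = s_rows[:s_split], s_rows[s_split:]

    out = join_build_right(r1, s1)    # region 1, plan A
    out += join_build_left(r2, s2)    # region 2, plan B
    out += join_build_right(r1, s2)   # cross-region pairs
    out += join_build_right(r2, s1)   # cross-region pairs
    return out


if __name__ == "__main__":
    R = [(1, "a"), (2, "b"), (2, "c"), (3, "d")]
    S = [(2, "x"), (3, "y"), (1, "z"), (2, "w")]
    print(sorted(partitioned_join(R, S, 2, 1)))
```

In this sketch the split points are fixed in advance; an adaptive system would presumably pick them at run time, for example when observed statistics suggest that the current plan is a poor fit for the data still to arrive.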