An adaptive query execution system for data integration

Authors:
Zachary G. Ives;Daniela Florescu;Marc Friedman;Alon Levy;Daniel S. Weld
Affiliations:
University of Washington;INRIA Rocquencourt;University of Washington;University of Washington;University of Washington
Venue:
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Year:
1999

Abstract

Query processing in data integration occurs over network-bound, autonomous data sources. This requires extensions to traditional optimization and execution techniques for three reasons: there is an absence of quality statistics about the data, data transfer rates are unpredictable and bursty, and slow or unavailable data sources can often be replaced by overlapping or mirrored sources. This paper presents the Tukwila data integration system, designed to support adaptivity at its core using a two-pronged approach. Interleaved planning and execution with partial optimization allows Tukwila to quickly recover from decisions based on inaccurate estimates. During execution, Tukwila uses adaptive query operators such as the double pipelined hash join, which produces answers quickly, and the dynamic collector, which robustly and efficiently computes unions across overlapping data sources. We demonstrate that the Tukwila architecture extends previous innovations in adaptive execution (such as query scrambling, mid-execution re-optimization, and choose nodes), and we present experimental evidence that our techniques result in behavior desirable for a data integration system.
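
The abstract names the double pipelined hash join as an operator that produces answers quickly. The sketch below illustrates the general idea behind such a join (often also called a symmetric hash join): each arriving tuple is inserted into a hash table for its own input and immediately probed against the hash table of the other input, so results can be emitted without waiting for either source to finish. It is a minimal in-memory illustration, not Tukwila's implementation; the function name, the "L"/"R" side tags, and the row format are assumptions made for this example, and memory overflow is not handled.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, Tuple

Row = Dict[str, Any]


def double_pipelined_hash_join(
    arrivals: Iterable[Tuple[str, Row]],
    left_key: str,
    right_key: str,
) -> Iterator[Tuple[Row, Row]]:
    """Yield (left_row, right_row) pairs as soon as both halves have arrived.

    `arrivals` is a hypothetical interleaved stream of (side, row) pairs,
    where side is "L" or "R", modelling tuples trickling in from two sources.
    """
    # One hash table per input; both are built incrementally (assumed to fit in memory).
    tables = {"L": defaultdict(list), "R": defaultdict(list)}
    keys = {"L": left_key, "R": right_key}

    for side, row in arrivals:
        other = "R" if side == "L" else "L"
        join_value = row[keys[side]]
        # Insert the new tuple into its own side's table...
        tables[side][join_value].append(row)
        # ...then probe the opposite table and emit any matches right away.
        for match in tables[other].get(join_value, ()):
            yield (row, match) if side == "L" else (match, row)


# Usage: tuples from the two sources arrive in an arbitrary interleaving.
arrivals = [
    ("L", {"id": 1, "name": "a"}),
    ("R", {"id": 1, "city": "x"}),
    ("R", {"id": 2, "city": "y"}),
    ("L", {"id": 2, "name": "b"}),
    ("L", {"id": 3, "name": "c"}),
]
results = list(double_pipelined_hash_join(arrivals, "id", "id"))
```

Because neither input is treated as the fixed "build" side, a delay in one source does not block output from tuples that have already arrived from both, which is the property the abstract highlights for bursty, unpredictable transfer rates.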