Answering aggregate queries in data exchange

Authors:
Foto Afrati;Phokion G. Kolaitis
Affiliations:
National Technical University of Athens, Athens, Greece;IBM Almaden Research Center, San Jose, CA, USA
Venue:
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2008

Abstract

Data exchange, also known as data translation, has been extensively investigated in recent years. One main direction of research has focused on the semantics and the complexity of answering first-order queries in the context of data exchange between relational schemas. In this paper, we initiate a systematic investigation of the semantics and the complexity of aggregate queries in data exchange, and make a number of conceptual and technical contributions. Data exchange is a context in which incomplete information arises, hence one has to cope with a set of possible worlds instead of a single database. Three different sets of possible worlds have been explored in the study of the certain answers of first-order queries in data exchange: the set of all solutions, the set of all universal solutions, and a set of possible worlds derived from the CWA-solutions. We examine each of these sets and point out that none of them is suitable for aggregation in data exchange, as each gives rise to rather trivial semantics. Our analysis also reveals that, to have meaningful semantics for aggregation in data exchange, a strict closed world assumption has to be adopted in selecting the set of possible worlds. For this, we introduce and study the set of the endomorphic images of the canonical universal solution as a set of possible worlds for aggregation in data exchange. Our main technical result is that for schema mappings specified by source-to-target tgds, there are polynomial-time algorithms for computing the range semantics of every scalar aggregation query, where the range semantics of an aggregate query is the pair consisting of the greatest lower bound and the least upper bound of the values that the query takes over the set of possible worlds. Among these algorithms, the most sophisticated is the one for the average operator, which makes use of concepts originally introduced in the study of the core of the universal solutions in data exchange. We also show that if, instead of range semantics, we consider possible answer semantics, then it is an NP-complete problem to tell whether a number is a possible answer of a given scalar aggregation query with the average operator.
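
To make the definitions in the abstract concrete, the following is a minimal brute-force sketch of range semantics over endomorphic images. It is not the polynomial-time algorithm the paper describes: it enumerates every assignment of nulls, which is exponential in the number of nulls, and is meant only to illustrate what is being computed. All names (`endomorphic_images`, `range_semantics`, `is_possible_answer`), the encoding of labeled nulls as strings starting with `_N`, the hand-written instance `J` standing in for a canonical universal solution, and the restriction that the aggregated column holds only constants are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product


def is_null(value):
    # Assumed encoding: labeled nulls are strings that start with "_N".
    return isinstance(value, str) and value.startswith("_N")


def endomorphic_images(instance):
    """Return every image h(J) where h fixes constants and h(J) is a subset of J."""
    facts = frozenset(instance)
    domain = list({v for fact in facts for v in fact[1:]})
    nulls = [v for v in domain if is_null(v)]
    images = set()
    # Brute force: try every mapping from nulls to values in the active domain.
    for targets in product(domain, repeat=len(nulls)):
        h = dict(zip(nulls, targets))
        image = frozenset(
            (fact[0],) + tuple(h.get(v, v) for v in fact[1:]) for fact in facts
        )
        if image <= facts:  # h is an endomorphism of the instance
            images.add(image)
    return images


def column(world, relation, position):
    # Bag of values in the given attribute (1-based) of the given relation.
    return [fact[position] for fact in world if fact[0] == relation]


def range_semantics(instance, relation, position, aggregate):
    """Smallest and largest aggregate value over all endomorphic images."""
    values = [
        aggregate(column(world, relation, position))
        for world in endomorphic_images(instance)
    ]
    return min(values), max(values)


def is_possible_answer(instance, relation, position, aggregate, value):
    """True if some endomorphic image yields exactly this aggregate value."""
    return any(
        aggregate(column(world, relation, position)) == value
        for world in endomorphic_images(instance)
    )


def average(values):
    return Fraction(sum(values), len(values))


# Hypothetical instance: each fact is (relation name, attribute values...).
J = {("T", "_N1", 10), ("T", "_N2", 10), ("T", "b", 20)}

print(range_semantics(J, "T", 2, sum))      # (30, 40)
print(range_semantics(J, "T", 2, average))  # (Fraction(40, 3), Fraction(15, 1))
print(is_possible_answer(J, "T", 2, average, 15))  # True
```

In this toy instance the two nulls can be collapsed onto each other, so the possible worlds are `J` itself and two two-fact images. The sum therefore ranges over [30, 40] and the average over [40/3, 15].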