Learning schema mappings

Authors:
Balder Ten Cate;Víctor Dalmau;Phokion G. Kolaitis
Affiliations:
University of California, Santa Cruz, CA;Universitat Pompeu Fabra, Barcelona, Spain;University of California, Santa Cruz and IBM Research--Almaden
Venue:
ACM Transactions on Database Systems (TODS) - Invited papers issue
Year:
2013

Abstract

A schema mapping is a high-level specification of the relationship between a source schema and a target schema. Recently, a line of research has emerged that aims at deriving schema mappings automatically or semi-automatically with the help of data examples, that is, pairs consisting of a source instance and a target instance that depict, in some precise sense, the intended behavior of the schema mapping. Several different uses of data examples for deriving, refining, or illustrating a schema mapping have already been proposed and studied. In this article, we use the lens of computational learning theory to systematically investigate the problem of algorithmically obtaining a schema mapping from data examples. Our aim is to leverage the rich body of work on learning theory in order to develop a framework for exploring the power and the limitations of the various algorithmic methods for obtaining schema mappings from data examples. We focus on GAV schema mappings, that is, schema mappings specified by GAV (Global-As-View) constraints. GAV constraints are the most basic and the most widely supported language for specifying schema mappings. We present an efficient algorithm for learning GAV schema mappings using Angluin's model of exact learning with membership and equivalence queries. This is optimal, since we show that neither membership queries alone nor equivalence queries alone suffice, unless the source schema consists of unary relations only. We also obtain results concerning the learnability of schema mappings in the context of Valiant's well-known PAC (Probably-Approximately-Correct) learning model, and concerning the learnability of restricted classes of GAV schema mappings. Finally, as a byproduct of our work, we show that there is no efficient algorithm for approximating the shortest GAV schema mapping fitting a given set of examples, unless the source schema consists of unary relations only.
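
To make the notions of a GAV constraint and a data example concrete, the following is a minimal sketch and not the article's algorithm. It assumes the usual reading of a GAV constraint as a conjunction of source atoms implying a single target atom with no existential variables. The relation names (Emp, Dept, Manages), the dictionary encoding, and the helper functions are hypothetical illustrations. The check shown is plain satisfaction of the constraint by a pair (source instance, target instance); the article's precise notion of how a data example "depicts" a mapping is not spelled out in the abstract and may be stricter.

```python
# Hypothetical GAV constraint: Emp(x, d) AND Dept(d, m) -> Manages(m, x)
# The body is a list of source atoms; the head is one target atom whose
# variables all occur in the body (no existential quantification).
constraint = {
    "body": [("Emp", ("x", "d")), ("Dept", ("d", "m"))],
    "head": ("Manages", ("m", "x")),
}


def body_matches(body, source):
    """Yield every variable assignment mapping all body atoms to source facts."""
    def extend(i, assignment):
        if i == len(body):
            yield dict(assignment)
            return
        relation, variables = body[i]
        for fact in source.get(relation, set()):
            if len(fact) != len(variables):
                continue
            candidate = dict(assignment)
            consistent = True
            for var, value in zip(variables, fact):
                # setdefault binds a fresh variable or returns the earlier binding
                if candidate.setdefault(var, value) != value:
                    consistent = False
                    break
            if consistent:
                yield from extend(i + 1, candidate)

    yield from extend(0, {})


def derived_facts(constraint, source):
    """Target facts that the constraint forces, given the source instance."""
    relation, variables = constraint["head"]
    return {
        (relation, tuple(assignment[v] for v in variables))
        for assignment in body_matches(constraint["body"], source)
    }


def satisfies(constraint, source, target):
    """True if every fact forced by the constraint is present in the target."""
    return all(
        args in target.get(relation, set())
        for relation, args in derived_facts(constraint, source)
    )


# A hypothetical source instance and two candidate target instances.
source = {
    "Emp": {("ann", "sales"), ("bob", "sales")},
    "Dept": {("sales", "carol")},
}
target_ok = {"Manages": {("carol", "ann"), ("carol", "bob")}}
target_bad = {"Manages": {("carol", "ann")}}
```

Under these assumptions, the pair (source, target_ok) satisfies the constraint, while (source, target_bad) does not, because the forced fact Manages(carol, bob) is missing. A learner in the query models mentioned above would use such examples as evidence for or against candidate constraints.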