Sample-driven schema mapping

Authors: Li Qian; Michael J. Cafarella; H. V. Jagadish
Affiliations: University of Michigan, Ann Arbor, USA (all three authors)
Venue: SIGMOD '12: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
Year: 2012

Abstract

End users increasingly need to perform lightweight, customized schema mapping. State-of-the-art tools provide powerful functions to generate schema mappings, but they usually require an in-depth understanding of the semantics of multiple schemas and their correspondences, and are thus not suitable for users who are technically unsophisticated or when a large number of mappings must be performed. We propose a system for sample-driven schema mapping. It automatically constructs schema mappings, in real time, from user-input sample target instances. Because the user does not have to provide any explicit attribute-level match information, she is isolated from the possibly complex structure and semantics of both the source schemas and the mappings. In addition, the user never has to master any operations specific to schema mappings: she simply types data values into a spreadsheet-style interface. As a result, the user can construct mappings with a much lower cognitive burden. In this paper we present Mweaver, a prototype sample-driven schema mapping system. It employs novel algorithms that enable the system to obtain desired mapping results while meeting interactive response performance requirements. We show the results of a user study that compares Mweaver with two state-of-the-art mapping tools across several mapping tasks, both real and synthetic. The results suggest that the Mweaver system enables users to perform practical mapping tasks in about one fifth of the time needed by the state-of-the-art tools.
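
The abstract does not describe Mweaver's algorithms, so the following is only a toy illustration of the general idea of deriving a mapping from sample target rows, not the paper's method. It assumes a source database held in memory as a dict of tables (each a list of dict rows), considers only single-table projections with no joins, and uses a hypothetical function name, `candidate_mappings`. A candidate survives only if every sample row the user typed appears in the projected source data.

```python
from itertools import permutations


def candidate_mappings(tables, sample_rows):
    """Return (table_name, columns) projections that contain every sample row.

    tables: dict mapping table name -> list of dict rows (assumed layout).
    sample_rows: list of tuples typed by the user, all of the same width.
    Toy sketch: brute-force search over single-table projections only.
    """
    width = len(sample_rows[0])
    results = []
    for name, rows in tables.items():
        if not rows:
            continue
        columns = list(rows[0].keys())
        # Try every ordered choice of `width` source columns.
        for cols in permutations(columns, width):
            projected = {tuple(row[c] for c in cols) for row in rows}
            # Keep the projection only if it explains all sample rows.
            if all(tuple(sample) in projected for sample in sample_rows):
                results.append((name, cols))
    return results


# Hypothetical source data for illustration.
tables = {
    "movie": [
        {"title": "Avatar", "director": "James Cameron", "year": 2009},
        {"title": "Titanic", "director": "James Cameron", "year": 1997},
    ],
    "person": [
        {"name": "James Cameron", "born": 1954},
    ],
}

ambiguous = candidate_mappings(tables, [("James Cameron",)])
narrowed = candidate_mappings(tables, [("Avatar", "James Cameron")])
```

With a single value the sample is ambiguous (it matches both `movie.director` and `person.name`); adding a second column to the sample narrows the candidates to one projection. This mirrors, in a very reduced form, how additional sample values can prune the space of candidate mappings.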