Why Your Data Won't Mix

Authors: Alon Halevy
Affiliations: University of Washington
Venue: Queue - Semi-structured Data
Year: 2005

Abstract

When independent parties develop database schemas for the same domain, they will almost always be quite different from each other. These differences are referred to as semantic heterogeneity, which also appears in the presence of multiple XML documents, Web services, and ontologies—or more broadly, whenever there is more than one way to structure a body of data. The presence of semi-structured data exacerbates semantic heterogeneity, because semi-structured schemas are much more flexible to start with. For multiple data systems to cooperate with each other, they must understand each other’s schemas. Without such understanding, the multitude of data sources amounts to a digital version of the Tower of Babel.
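
As a small illustration of the heterogeneity described above, the sketch below shows two records for the same person as two independent parties might have modeled them, plus a hand-written mapping between the two shapes. It is not taken from the article: every field name, value, and the `b_to_a` helper are hypothetical, chosen only to make the idea concrete.

```python
# Two hypothetical records describing the same person, as two independent
# parties might have modeled them. All field names and values are invented.
record_a = {"name": "Ada Lovelace", "phone": "555-0100", "dept": "Math"}
record_b = {
    "first_name": "Ada",
    "last_name": "Lovelace",
    "contact": {"tel": "555-0100"},  # nested, unlike the flat layout of A
    "department": "Math",
}


def b_to_a(rec):
    """Hand-written (hypothetical) mapping from B's shape to A's shape."""
    return {
        "name": f"{rec['first_name']} {rec['last_name']}",
        "phone": rec["contact"]["tel"],
        "dept": rec["department"],
    }


# Without a mapping, the two records share no top-level field names.
shared_fields = set(record_a) & set(record_b)

# With the mapping applied, the records can be compared directly.
same_entity = b_to_a(record_b) == record_a
```

Here the two records carry the same information, yet a system that knows only one layout cannot read the other until someone supplies the mapping. Writing that mapping by hand is the step that requires each side to understand the other's schema.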