Why Your Data Won't Mix

  • Authors: Alon Halevy
  • Affiliations: University of Washington
  • Venue: Queue - Semi-structured Data
  • Year: 2005

Abstract

When independent parties develop database schemas for the same domain, the schemas will almost always differ substantially from one another. These differences are referred to as semantic heterogeneity, which also arises among multiple XML documents, Web services, and ontologies, or more broadly, whenever there is more than one way to structure a body of data. Semi-structured data exacerbates semantic heterogeneity because semi-structured schemas are much more flexible to begin with. For multiple data systems to cooperate, they must understand each other’s schemas. Without such understanding, the multitude of data sources amounts to a digital version of the Tower of Babel.
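
As a rough illustration of the kind of mismatch the abstract describes, the sketch below shows two records that carry the same facts under independently designed schemas, plus a hand-written mapping between them. The field names, the sample values, and the `b_to_a` helper are all hypothetical and are not taken from the article; they only show why one system cannot consume the other's data without some understanding of its schema.

```python
# Two hypothetical schemas for the same domain (employee records),
# designed independently. All field names and values are illustrative.
record_a = {"name": "Ada Lovelace", "dept": "Engineering", "salary_usd": 120000}

record_b = {
    "employee": {"first": "Ada", "last": "Lovelace"},  # name split in two
    "division": "Engineering",                         # same concept, other label
    "pay": {"amount": 10000, "currency": "USD", "period": "monthly"},
}


def b_to_a(rec):
    """Translate a schema-B record into schema A using a hand-written mapping."""
    person = rec["employee"]
    pay = rec["pay"]
    if pay["currency"] != "USD":
        # Currency conversion is out of scope for this sketch.
        raise ValueError("this sketch only handles USD")
    periods_per_year = {"monthly": 12, "yearly": 1}[pay["period"]]
    return {
        "name": f"{person['first']} {person['last']}",
        "dept": rec["division"],
        "salary_usd": pay["amount"] * periods_per_year,
    }


translated = b_to_a(record_b)
print(translated == record_a)
```

Even in this toy case the mapping has to reconcile differences in naming (`dept` versus `division`), in structure (a flat name versus nested first and last names), and in units (annual versus monthly pay), and every one of these has to be known before the two sources can be combined.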