Improving data quality through effective use of data semantics

  • Authors:
  • Stuart Madnick; Hongwei Zhu

  • Affiliations:
  • MIT Sloan School of Management, Information Technologies, Cambridge, MA; MIT Sloan School of Management, Information Technologies, Cambridge, MA

  • Venue:
  • Data & Knowledge Engineering - Special issue: WIDM 2004
  • Year:
  • 2006

Abstract

Data quality issues have taken on increasing importance in recent years. In our research, we have discovered that many "data quality" problems are actually "data misinterpretation" problems--that is, problems caused by heterogeneous data semantics. In this paper, we first identify semantic heterogeneities that, when not resolved, often cause data quality problems. We discuss the especially challenging problem of aggregational ontological heterogeneity, which concerns how complex entities and their relationships are aggregated. Then we illustrate how COntext INterchange (COIN) technology can be used to capture data semantics and reconcile semantic heterogeneities, thereby improving data quality.
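To make the idea of a "data misinterpretation" problem concrete, here is a minimal sketch of reconciling one value between two contexts that differ in currency and scale factor. It is not COIN's actual interface: the `Context` class, the `convert` function, and the exchange rate are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Implicit assumptions a source or receiver makes about a monetary value.

    Hypothetical structure for illustration; not part of COIN.
    """
    currency: str  # e.g. "USD"
    scale: int     # 1 = units, 1000 = thousands


# Made-up exchange rates (USD per one unit of the currency), illustration only.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.25}


def convert(value: float, source: Context, receiver: Context) -> float:
    """Re-express a value stated in the source context in the receiver context."""
    # Normalize to plain units of USD, then restate in the receiver's terms.
    in_usd_units = value * source.scale * RATES_TO_USD[source.currency]
    return in_usd_units / RATES_TO_USD[receiver.currency] / receiver.scale


# The source reports revenue in thousands of EUR; the receiver assumes units of USD.
source_ctx = Context(currency="EUR", scale=1000)
receiver_ctx = Context(currency="USD", scale=1)

raw = 2500.0  # correct in the source context, misleading if read as plain USD
reconciled = convert(raw, source_ctx, receiver_ctx)
```

The stored value is not wrong in itself. It only looks like bad data when it is read under the receiver's assumptions, which is why recording those assumptions explicitly lets the conversion be applied automatically.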