Entity Matching in Heterogeneous Databases: A Distance Based Decision Model

  • Authors:
  • Debabrata Dey; Sumit Sarkar; Prabuddha De

  • Venue:
  • HICSS '98: Proceedings of the Thirty-First Annual Hawaii International Conference on System Sciences, Volume 7
  • Year:
  • 1998

Abstract

The need to leverage the information contained in heterogeneous data sources has been widely documented in recent years. In order to accomplish this goal, an organization must resolve several types of heterogeneity problems that may exist across different data sources. We investigate one such problem called the entity heterogeneity problem. This problem arises when the same real-world entity type is represented using different identifiers in different applications. We propose a decision-theoretic model to resolve the problem. Our model uses a distance-based measure to express the similarity between two entity instances. We have implemented the model, and our experimental results indicate that this is a viable approach in real-world situations.
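
The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea of distance-based matching with a cost-sensitive decision, not the authors' model. The token-overlap distance, the attribute weights, the use of (1 - distance) as a stand-in for match probability, and all function names are assumptions made for illustration.

```python
def attribute_distance(a: str, b: str) -> float:
    """Distance in [0, 1] between two attribute values.

    Assumption: 1 minus the Jaccard overlap of lowercase word tokens.
    """
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0  # two empty values are treated as identical
    return 1.0 - len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def record_distance(r1: dict, r2: dict, weights: dict) -> float:
    """Weighted average of per-attribute distances (hypothetical weights)."""
    total_weight = sum(weights.values())
    weighted = sum(
        w * attribute_distance(r1[attr], r2[attr]) for attr, w in weights.items()
    )
    return weighted / total_weight


def decide_match(distance: float, cost_false_match: float,
                 cost_false_nonmatch: float) -> bool:
    """Declare a match when doing so has the lower expected cost.

    Assumption: (1 - distance) is used as a rough proxy for the
    probability that the two instances denote the same entity.
    """
    p_match = 1.0 - distance
    expected_cost_if_match = (1.0 - p_match) * cost_false_match
    expected_cost_if_nonmatch = p_match * cost_false_nonmatch
    return expected_cost_if_match <= expected_cost_if_nonmatch


# Two records for the same person, stored under different identifiers.
customer = {"name": "John Smith", "city": "Austin"}
client = {"name": "John A Smith", "city": "Austin"}
weights = {"name": 2.0, "city": 1.0}

d = record_distance(customer, client, weights)
print(decide_match(d, cost_false_match=1.0, cost_false_nonmatch=1.0))
print(decide_match(d, cost_false_match=10.0, cost_false_nonmatch=1.0))
```

In this sketch the same pair of records can be accepted or rejected depending on the relative cost of a false match versus a missed match, which is the kind of trade-off a decision-theoretic formulation makes explicit.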