Learning approach for domain-independent linked data instance matching

  • Authors: Khai Nguyen; Ryutaro Ichise; Hoai-Bac Le

  • Affiliations: University of Science, Ho Chi Minh, Vietnam; National Institute of Informatics, Tokyo, Japan; University of Science, Ho Chi Minh, Vietnam

  • Venue: Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics
  • Year: 2012

Abstract

Because almost all linked data sources are currently published by different providers, interlinking homogeneous instances across these sources is an important problem in data integration. Recently, instance matching has been used to identify owl:sameAs links between linked datasets. Previous approaches rely primarily on predefined maps of corresponding attributes, and most of them are limited to matching in specific domains. In this paper, we propose LFM, a learning-based instance matching system designed to be a reliable domain-independent matcher. First, we compute similarity vectors between labeled pairs of instances without specifying the meaning of the RDF predicates. Then a learning process is applied to these vectors to learn a tree classifier that predicts whether new pairs of instances are identical. Experiments demonstrate that our method achieves a 4% improvement in precision and recall over recent top-ranked matchers when a small amount of labeled data is used for learning.
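
The two steps described in the abstract can be illustrated with a minimal sketch. This is not the LFM implementation: the predicate names (ex:country, ex:nation, and so on), the toy instances, the use of difflib string similarity, and the depth-one decision tree (a stump) standing in for the tree classifier are all assumptions made for illustration. Each instance is a mapping from predicate to literal value, and the similarity vector holds one score for every (source predicate, target predicate) pair, so no attribute correspondence has to be supplied by hand.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity_vector(a, b, src_preds, tgt_preds):
    """One string similarity per (source predicate, target predicate) pair.

    No predicate alignment is given; the learner decides which pairs matter.
    """
    vec = []
    for p, q in product(src_preds, tgt_preds):
        va, vb = a.get(p), b.get(q)
        if va is None or vb is None:
            vec.append(0.0)  # missing value: treat as no similarity
        else:
            vec.append(SequenceMatcher(None, va.lower(), vb.lower()).ratio())
    return vec


def train_stump(vectors, labels):
    """Depth-one decision tree: choose the (feature, threshold) split
    that misclassifies the fewest labeled pairs."""
    best = None
    for f in range(len(vectors[0])):
        values = sorted({v[f] for v in vectors})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            errors = sum((v[f] > t) != y for v, y in zip(vectors, labels))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    if best is None:
        raise ValueError("no feature separates the training pairs")
    return best[1], best[2]


def predict(stump, vec):
    """Return True when the pair is predicted to denote the same entity."""
    f, t = stump
    return vec[f] > t


# Hypothetical predicates of two differently modeled sources.
SRC_PREDS = ["rdfs:label", "ex:country"]
TGT_PREDS = ["foaf:name", "ex:nation"]

# Hypothetical labeled pairs: (source instance, target instance, same entity?).
LABELED = [
    ({"rdfs:label": "Tokyo", "ex:country": "Japan"},
     {"foaf:name": "Tokyo", "ex:nation": "Japan"}, True),
    ({"rdfs:label": "Hanoi", "ex:country": "Vietnam"},
     {"foaf:name": "Ha Noi", "ex:nation": "Vietnam"}, True),
    ({"rdfs:label": "Kyoto", "ex:country": "Japan"},
     {"foaf:name": "Osaka", "ex:nation": "Japan"}, False),
    ({"rdfs:label": "Hue", "ex:country": "Vietnam"},
     {"foaf:name": "Paris", "ex:nation": "France"}, False),
]

vectors = [similarity_vector(a, b, SRC_PREDS, TGT_PREDS) for a, b, _ in LABELED]
labels = [y for _, _, y in LABELED]
stump = train_stump(vectors, labels)

# Classify an unseen pair with the learned split.
new_pair = similarity_vector(
    {"rdfs:label": "Osaka", "ex:country": "Japan"},
    {"foaf:name": "Osaka", "ex:nation": "Japan"},
    SRC_PREDS, TGT_PREDS,
)
print(stump, predict(stump, new_pair))
```

In this toy data the country values agree for a non-matching pair (Kyoto and Osaka are both in Japan), so the learner has to discover from the labels alone that the name-to-label similarity is the informative feature. A full tree learner would combine several such splits instead of one.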