Active learning of expressive linkage rules for the web of data

  • Authors:
  • Robert Isele;Anja Jentzsch;Christian Bizer

  • Affiliations:
  • Web-based Systems Group, Freie Universität Berlin, Berlin, Germany;Web-based Systems Group, Freie Universität Berlin, Berlin, Germany;Web-based Systems Group, Freie Universität Berlin, Berlin, Germany

  • Venue:
  • ICWE'12 Proceedings of the 12th international conference on Web Engineering
  • Year:
  • 2012

Abstract

The amount of data available as Linked Data on the Web has grown rapidly in recent years. However, the linkage between data sources remains sparse, as setting RDF links requires effort from data publishers. Many existing methods for generating these links rely on explicit linkage rules, which specify the conditions that must hold for two entities to be interlinked. As writing good linkage rules by hand is a non-trivial problem, the burden of generating links between data sources is still high. To reduce the effort and expertise required to write linkage rules, we present an approach that combines genetic programming and active learning for the interactive generation of expressive linkage rules. Our approach automates the generation of a linkage rule and only requires the user to confirm or decline a number of example links. The algorithm minimizes user involvement by selecting example links that yield a high information gain. The proposed approach has been implemented in the Silk Link Discovery Framework. In our experiments, the algorithm was capable of finding linkage rules with a full F1-measure by asking the user to confirm or decline at most 20 links.
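The idea of selecting example links with high information gain can be illustrated with a minimal sketch. It assumes that information gain is approximated by the disagreement among a small committee of candidate linkage rules, measured as vote entropy. This is a common active-learning heuristic and not necessarily the selection strategy used in the paper or in Silk. All names here (`vote_entropy`, `select_query`, `make_rule`, `name_sim`, `date_sim`) are hypothetical, and the similarity scores are made-up illustrative values.

```python
import math

def vote_entropy(votes):
    """Binary entropy of boolean votes: 0.0 if unanimous, 1.0 for an even split."""
    p = sum(votes) / len(votes)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def select_query(candidates, committee):
    """Return the candidate link on which the committee of rules disagrees most."""
    return max(
        candidates,
        key=lambda pair: vote_entropy([rule(pair) for rule in committee]),
    )

def make_rule(key, threshold):
    """Hypothetical linkage rule: accept a pair if one similarity score reaches a threshold."""
    return lambda pair: pair[key] >= threshold

# Assumed committee of candidate rules, e.g. the current population of a learner.
committee = [
    make_rule("name_sim", 0.5),
    make_rule("name_sim", 0.8),
    make_rule("date_sim", 0.5),
    make_rule("date_sim", 0.9),
]

# Illustrative candidate links with precomputed similarity scores.
candidates = [
    {"id": "a", "name_sim": 0.95, "date_sim": 0.95},  # every rule accepts
    {"id": "b", "name_sim": 0.10, "date_sim": 0.10},  # every rule rejects
    {"id": "c", "name_sim": 0.90, "date_sim": 0.20},  # rules split evenly
]

# The link the user would be asked to confirm or decline next.
query = select_query(candidates, committee)
```

In this sketch the user would be asked about link `c`, since the rules split evenly on it and the answer therefore rules out half of the committee. Links `a` and `b` receive unanimous votes, so a confirmation or rejection there would not distinguish between the candidate rules.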