A large dataset for the evaluation of ontology matching

  • Authors:
  • Fausto Giunchiglia;Mikalai Yatskevich;Paolo Avesani;Pavel Shvaiko

  • Affiliations:
  • Department of Information Engineering and Computer Science (DISI), University of Trento, 38050 Povo, Trento, Italy;Department of Information Engineering and Computer Science (DISI), University of Trento, 38050 Povo, Trento, Italy;Fondazione Bruno Kessler, Via Sommarive 18, 38050 Povo, Trento, Italy;Department of Information Engineering and Computer Science (DISI), University of Trento, 38050 Povo, Trento, Italy

  • Venue:
  • The Knowledge Engineering Review
  • Year:
  • 2009

Abstract

Recently, the number of ontology matching techniques and systems has increased significantly, which makes their evaluation and comparison a more pressing issue. One of the challenges of ontology matching evaluation is building large-scale evaluation datasets. The number of possible correspondences between two ontologies grows quadratically with the number of entities in these ontologies, which often makes the manual construction of evaluation datasets demanding to the point of being infeasible for large-scale matching tasks. In this paper, we present TaxME2, an ontology matching evaluation dataset composed of thousands of matching tasks. It was built semi-automatically from the Google, Yahoo, and Looksmart web directories. We evaluated TaxME2 using the results of almost two dozen state-of-the-art ontology matching systems. The experiments indicate that the dataset possesses the desired key properties: it is error-free, incremental, discriminative, monotonic, and hard for state-of-the-art ontology matching systems.
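
To make the quadratic growth mentioned in the abstract concrete, the following minimal sketch counts the entity pairs that a manually built reference alignment would have to consider. The function name `candidate_pairs` and the ontology sizes are illustrative assumptions, not figures from the paper, and the count ignores the fact that each pair may also admit several kinds of relation.

```python
def candidate_pairs(n_source: int, n_target: int) -> int:
    """Count entity pairs between two ontologies (hypothetical helper).

    Every entity of the source ontology can in principle be paired with
    every entity of the target ontology, so the count is a simple product.
    """
    return n_source * n_target


# Illustrative sizes only; they are not taken from the TaxME2 dataset.
for size in (10, 100, 1_000, 10_000):
    pairs = candidate_pairs(size, size)
    print(f"{size:>6} entities per ontology: {pairs:>12,} candidate pairs")
```

Under these assumptions, doubling the size of both ontologies quadruples the number of pairs to inspect, which is why purely manual annotation quickly becomes impractical at scale.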