Web Service aggregation with string distance ensembles and active probe selection

  • Authors:
  • Eddie Johnston; Nicholas Kushmerick

  • Affiliations:
  • School of Computer Science and Informatics, University College Dublin, Belfield, Dublin, Ireland; School of Computer Science and Informatics, University College Dublin, Belfield, Dublin, Ireland

  • Venue:
  • Information Fusion
  • Year:
  • 2008

Abstract

The adoption of standards for exchanging information across the Web presents both new opportunities and important challenges for data integration and aggregation. Although Web Services simplify the discovery and access of information sources, the problem of semantic heterogeneity remains: how to find semantic correspondences across the data being integrated. In this paper, we explore these issues in the context of Web Services, and propose OATS, a novel algorithm for schema matching that is specifically suited to Web Service data aggregation. We show how probing Web Services with a small set of related queries results in semantically correlated data instances, which greatly simplifies the matching process, and demonstrate that an ensemble of string distance metrics performs better at matching data instances than any individual metric. We also show how the choice of probe queries has a dramatic effect on matching accuracy. Motivated by this observation, we describe and evaluate a machine learning approach to selecting probes to maximise accuracy while minimising cost.
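
The idea of matching fields through an ensemble of string distances over probe-correlated instances can be illustrated with a minimal sketch. This is not the OATS algorithm as published: the three metrics, the unweighted averaging, the function names (`ensemble_distance`, `match_fields`) and the sample data are all assumptions chosen for illustration. Each field holds one value per probe query, so the i-th values of two fields come from the same probe and can be compared directly.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> float:
    """Edit distance, normalised to [0, 1] by the longer string's length."""
    if not a and not b:
        return 0.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1] / max(len(a), len(b))


def token_jaccard(a: str, b: str) -> float:
    """One minus the Jaccard overlap of whitespace-separated tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def ratio_distance(a: str, b: str) -> float:
    """One minus difflib's similarity ratio."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


# Hypothetical ensemble; the metrics used in the paper may differ.
METRICS = (levenshtein, token_jaccard, ratio_distance)


def ensemble_distance(xs, ys, metrics=METRICS):
    """Average each metric over probe-aligned value pairs, then average the metrics."""
    per_metric = []
    for metric in metrics:
        scores = [metric(x, y) for x, y in zip(xs, ys)]
        per_metric.append(sum(scores) / len(scores))
    return sum(per_metric) / len(per_metric)


def match_fields(service_a, service_b):
    """Map each field of service_a to the closest field of service_b."""
    return {
        fa: min(service_b, key=lambda fb: ensemble_distance(va, service_b[fb]))
        for fa, va in service_a.items()
    }


# Made-up responses of two services to the same two probe queries.
service_a = {"city": ["Dublin", "Cork"], "temp": ["12", "14"]}
service_b = {"location": ["Dublin, Ireland", "Cork, Ireland"],
             "temperature": ["12.0", "13.5"]}

matches = match_fields(service_a, service_b)
print(matches)
```

Because both services are probed with the same queries, values are compared only within a probe rather than across all pairs, which is what makes simple string distances informative here. A real system would presumably weight or learn the combination of metrics rather than take a plain mean.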