A Technique for Extracting Sub-source Similarities from Information Sources Having Different Formats

  • Authors: Domenico Rosaci; Giorgio Terracina; Domenico Ursino

  • Affiliations: DIMET, Università “Mediterranea” di Reggio Calabria, Via Graziella, Località Feo di Vito, 89060 Reggio Calabria, Italy; Dipartimento di Matematica, Università della Calabria, Via P. Bucci, 87036 Rende (CS), Italy; DIMET, Università “Mediterranea” di Reggio Calabria, Via Graziella, Località Feo di Vito, 89060 Reggio Calabria, Italy

  • Venue: World Wide Web
  • Year: 2003

Abstract

In this paper we propose a semi-automatic technique for deriving the similarity degree between two portions of heterogeneous information sources (hereafter, sub-sources). The technique consists of two phases: the first selects the most promising pairs of sub-sources, and the second computes the similarity degree for each promising pair. We show that detecting sub-source similarities is a special case of the more general problem of Scheme Match, and a particularly interesting one for semi-structured information sources. In addition, we present a real example case that clarifies the technique, a set of experiments conducted to verify the quality of its results, a discussion of its computational complexity, and its classification in the context of related literature. Finally, we discuss some possible applications that can benefit from the derived similarities.
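
The two-phase structure described in the abstract can be illustrated with a minimal sketch. The abstract does not say how promising pairs are selected or how the similarity degree is computed, so everything concrete here is an assumption made for illustration: sub-sources are modelled as sets of element names, phase 1 keeps pairs that share at least one name, and phase 2 uses Jaccard overlap as a stand-in measure. The function names and sample data are hypothetical and are not taken from the paper.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard overlap of two sets. A stand-in measure, not the paper's."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_promising_pairs(source1, source2, min_shared=1):
    """Phase 1 (assumed filter): keep pairs of sub-sources that share
    at least `min_shared` element names."""
    return [
        (n1, n2)
        for (n1, e1), (n2, e2) in product(source1.items(), source2.items())
        if len(set(e1) & set(e2)) >= min_shared
    ]


def similarity_degrees(source1, source2, pairs):
    """Phase 2: compute a similarity degree for each promising pair only."""
    return {(n1, n2): jaccard(source1[n1], source2[n2]) for n1, n2 in pairs}


# Hypothetical sub-sources from two differently formatted sources,
# each reduced to a mapping: sub-source name -> set of element names.
xml_source = {
    "customer": {"name", "address", "phone"},
    "order": {"id", "date", "total"},
}
db_source = {
    "client": {"name", "address", "email"},
    "invoice": {"id", "date", "amount", "total"},
}

pairs = select_promising_pairs(xml_source, db_source)
degrees = similarity_degrees(xml_source, db_source, pairs)
```

Separating the two phases means the costlier similarity computation runs only on the pairs that survive the cheap filter, which is the motivation the abstract gives for selecting promising pairs first.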