SCHEMA - an algorithm for automated product taxonomy mapping in e-commerce

  • Authors:
  • Steven S. Aanen;Lennart J. Nederstigt;Damir Vandić;Flavius Frăsincar

  • Affiliations:
  • Erasmus Universiteit Rotterdam, Rotterdam, The Netherlands;Erasmus Universiteit Rotterdam, Rotterdam, The Netherlands;Erasmus Universiteit Rotterdam, Rotterdam, The Netherlands;Erasmus Universiteit Rotterdam, Rotterdam, The Netherlands

  • Venue:
  • ESWC'12 Proceedings of the 9th international conference on The Semantic Web: research and applications
  • Year:
  • 2012

Abstract

This paper proposes SCHEMA, an algorithm for automated mapping between heterogeneous product taxonomies in the e-commerce domain. SCHEMA utilises word sense disambiguation techniques, based on the ideas from the algorithm proposed by Lesk, in combination with the semantic lexicon WordNet. For finding candidate map categories and determining the path-similarity we propose a node matching function that is based on the Levenshtein distance. The final mapping quality score is calculated using the Damerau-Levenshtein distance and a node-dissimilarity penalty. The performance of SCHEMA was tested on three real-life datasets and compared with PROMPT and the algorithm proposed by Park & Kim. It is shown that SCHEMA improves considerably on both recall and F$_1$-score, while maintaining similar precision.
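
The two edit distances named in the abstract can be illustrated with a minimal Python sketch. It is not the SCHEMA implementation: the abstract does not specify the node matching function or the node-dissimilarity penalty, so the function names, the length-based normalisation, and the use of the optimal string alignment variant of Damerau-Levenshtein are all assumptions made for illustration.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def damerau_levenshtein(a: Sequence, b: Sequence) -> int:
    """Optimal string alignment variant (assumed here): like Levenshtein,
    but swapping two adjacent elements costs a single edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,
                          d[i][j - 1] + 1,
                          d[i - 1][j - 1] + cost)
            # Adjacent transposition
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def normalized_similarity(a: str, b: str) -> float:
    """Hypothetical node-name similarity in [0, 1]: one minus the
    case-insensitive Levenshtein distance divided by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a.lower(), b.lower()) / longest
```

Both distance functions accept any sequence, so they can compare category names character by character or category paths node by node. For instance, `damerau_levenshtein(["Electronics", "Audio", "Headphones"], ["Electronics", "Headphones", "Audio"])` treats the swapped last two nodes as a single edit, whereas plain Levenshtein counts two substitutions.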