A knowledge-rich approach to measuring the similarity between Bulgarian and Russian words

  • Authors:
  • Svetlin Nakov; Elena Paskaleva; Preslav Nakov

  • Affiliations:
  • Sofia University "St. Kliment Ohridski", Sofia, Bulgaria; Bulgarian Academy of Sciences, Sofia, Bulgaria; National University of Singapore, Singapore

  • Venue:
  • MRTECEEL '09 Proceedings of the Workshop on Multilingual Resources, Technologies and Evaluation for Central and Eastern European Languages
  • Year:
  • 2009

Abstract

We propose a novel knowledge-rich approach to measuring the similarity between a pair of words. The algorithm is tailored to Bulgarian and Russian and takes into account the orthographic and phonetic correspondences between the two Slavic languages: it combines lemmatization, hand-crafted transformation rules, and weighted Levenshtein distance. In our experiments it achieves an 11-point interpolated average precision of 90.58%, a sizeable improvement over two classic competing approaches.
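
To make the weighted Levenshtein component concrete, here is a minimal Python sketch of an edit distance in which some letter substitutions cost less than others. It is an illustration under stated assumptions, not the authors' implementation: the abstract gives neither the weights nor the transformation rules, so the `SUB_COSTS` table, its 0.5 values, the function names, and the length-based normalization in `similarity` are all hypothetical. Lemmatization and the hand-crafted rules are left out.

```python
from typing import Dict, Tuple

# Hypothetical substitution costs (NOT taken from the paper): letter pairs that
# often correspond between Bulgarian and Russian spellings get a reduced cost.
SUB_COSTS: Dict[Tuple[str, str], float] = {
    ("ъ", "о"): 0.5,  # e.g. Bulgarian "сън" vs Russian "сон"
    ("ъ", "у"): 0.5,  # e.g. Bulgarian "ъгъл" vs Russian "угол"
}


def sub_cost(a: str, b: str) -> float:
    """Cost of replacing letter a with letter b (symmetric, default 1.0)."""
    if a == b:
        return 0.0
    return SUB_COSTS.get((a, b), SUB_COSTS.get((b, a), 1.0))


def weighted_levenshtein(s: str, t: str, ins_del_cost: float = 1.0) -> float:
    """Edit distance with per-pair substitution costs, computed row by row."""
    prev = [j * ins_del_cost for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [i * ins_del_cost]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + ins_del_cost,         # delete a from s
                curr[j - 1] + ins_del_cost,     # insert b into s
                prev[j - 1] + sub_cost(a, b),   # substitute a with b
            ))
        prev = curr
    return prev[-1]


def similarity(s: str, t: str) -> float:
    """Map the distance to [0, 1] by dividing by the longer word's length
    (an assumed normalization, valid while no single edit costs more than 1)."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0
    return 1.0 - weighted_levenshtein(s, t) / longest
```

With these assumed weights, `weighted_levenshtein("сън", "сон")` is 0.5 where the unweighted distance would be 1, so the pair scores as more similar than an unrelated one-letter difference would.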