A first approach to CLIR using character n-grams alignment

  • Authors: Jesús Vilares; Michael P. Oakes; John I. Tait

  • Affiliations: Departamento de Computación, Universidade da Coruña, A Coruña, Spain; School of Computing and Technology, University of Sunderland, Sunderland, United Kingdom; School of Computing and Technology, University of Sunderland, Sunderland, United Kingdom

  • Venue: CLEF'06 Proceedings of the 7th international conference on Cross-Language Evaluation Forum: evaluation of multilingual and multi-modal information retrieval
  • Year: 2006

Abstract

This paper describes the character n-gram translation technique we developed for our participation in CLEF 2006. Because it translates n-grams rather than words, the approach needs no word normalization during indexing or translation, and it can also handle out-of-vocabulary words. Since it does not rely on language-specific processing, it can be applied to very different languages, even when linguistic information and resources are scarce or unavailable. Our proposal relies heavily on freely available resources and aims to make the n-gram alignment process faster than in other similar approaches.
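
The abstract does not spell out the alignment procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: words are split into overlapping character n-grams, and source and target n-grams are scored by how often they co-occur in word pairs that are assumed to be already aligned. The Dice coefficient is used here purely as an illustrative association measure, and the function names `char_ngrams` and `dice_alignments`, the default `n=4`, and the `word_pairs` input are all hypothetical rather than taken from the paper.

```python
from collections import Counter
from itertools import product


def char_ngrams(word, n=4):
    """Return the overlapping character n-grams of a word.

    Words no longer than n are returned whole (an assumption of this sketch).
    """
    word = word.lower()
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def dice_alignments(word_pairs, n=4):
    """Score source/target n-gram pairs with the Dice coefficient.

    word_pairs: iterable of (source_word, target_word) tuples, assumed to be
    word-aligned beforehand (hypothetical input, not described in the abstract).
    """
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in word_pairs:
        # Count each n-gram at most once per word pair.
        src_grams = set(char_ngrams(src, n))
        tgt_grams = set(char_ngrams(tgt, n))
        src_count.update(src_grams)
        tgt_count.update(tgt_grams)
        joint.update(product(src_grams, tgt_grams))
    # Dice(s, t) = 2 * cooccurrences(s, t) / (occurrences(s) + occurrences(t))
    return {
        (s, t): 2 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in joint.items()
    }


if __name__ == "__main__":
    pairs = [("tomato", "tomate"), ("potato", "patata"), ("milk", "leche")]
    scores = dice_alignments(pairs)
    for (s, t), score in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
        print(s, t, round(score, 2))
```

In a retrieval setting, a query word (including one never seen as a whole word) could then be split with `char_ngrams` and each n-gram replaced by its highest-scoring target n-grams, which is one way the out-of-vocabulary robustness described above might be realized.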