Translating cross-lingual spelling variants using transformation rules

  • Authors:
  • Jarmo Toivonen;Ari Pirkola;Heikki Keskustalo;Kari Visala;Kalervo Järvelin

  • Affiliations:
  • Institute of Signal Processing, Tampere University of Technology, P.O. Box 553, FIN-33101 Tampere, Finland;Department of Information Studies, University of Tampere, P.O. Box 607, FIN-33101 Tampere, Finland;Department of Information Studies, University of Tampere, P.O. Box 607, FIN-33101 Tampere, Finland;Department of Information Studies, University of Tampere, P.O. Box 607, FIN-33101 Tampere, Finland;Department of Information Studies, University of Tampere, P.O. Box 607, FIN-33101 Tampere, Finland

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2005

Abstract

Technical terms and proper names constitute a major problem in dictionary-based cross-language information retrieval (CLIR), because translation dictionaries often do not cover them. However, technical terms and proper names in different languages often share the same Latin or Greek origin and are thus spelling variants of each other. In this paper we present a novel two-step fuzzy translation technique for cross-lingual spelling variants. In the first step, transformation rules are applied to source words to render them more similar to their target-language equivalents. The rules are generated automatically using translation dictionaries as source data. In the second step, the intermediate forms obtained in the first step are translated into the target language using fuzzy matching. The effectiveness of the technique was evaluated empirically using five source languages, with English as the target language. The two-step technique performed better, in some cases considerably better, than fuzzy matching alone. Even the first step used on its own showed promising results.
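The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rules in `RULES`, the word list `LEXICON`, the function names, and the choice of a bigram Dice coefficient as the fuzzy matcher are all assumptions made for illustration. In the paper the rules are generated automatically from translation dictionaries, whereas here they are written by hand.

```python
from typing import List, Set, Tuple

# Hypothetical Finnish-to-English rules (illustrative only, not taken from the paper):
# each pair rewrites a source substring into its assumed target-language counterpart.
RULES: List[Tuple[str, str]] = [("k", "ch"), ("ia", "y")]

# Hypothetical target-language word list standing in for a database index.
LEXICON: List[str] = ["psychology", "physiology", "pathology"]


def apply_rules(word: str, rules: List[Tuple[str, str]]) -> Set[str]:
    """Step 1: return the word and every intermediate form the rules produce."""
    forms = {word}
    for src, tgt in rules:
        # Keep both the rewritten and the unrewritten forms, since a rule
        # may or may not be appropriate for a given word.
        forms |= {form.replace(src, tgt) for form in forms}
    return forms


def bigrams(word: str) -> Set[str]:
    """Character bigrams of the word, padded so word edges are represented."""
    padded = f"_{word}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def dice(a: str, b: str) -> float:
    """Dice coefficient over bigram sets, used here as a simple fuzzy matcher."""
    x, y = bigrams(a), bigrams(b)
    return 2 * len(x & y) / (len(x) + len(y))


def fuzzy_translate(word: str, rules: List[Tuple[str, str]], targets: List[str]) -> str:
    """Step 2: pick the target word most similar to any intermediate form."""
    forms = apply_rules(word, rules)
    return max(targets, key=lambda t: max(dice(f, t) for f in forms))


best = fuzzy_translate("psykologia", RULES, LEXICON)
```

In this sketch, the Finnish word "psykologia" yields the intermediate form "psychology" in the first step, so the second step finds an exact match in the word list. With real, automatically generated rules the intermediate forms would usually be only closer to the target word rather than identical to it, which is why the fuzzy matching step is still needed.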