Optimization of word alignment clues

  • Authors: Jörg Tiedemann
  • Affiliations: Alfa-Informatica, University of Groningen, Groningen, The Netherlands (e-mail: tiedeman@let.rug.nl)
  • Venue: Natural Language Engineering
  • Year: 2005

Abstract

Statistical, linguistic, and heuristic clues can be used for the alignment of words and multi-word units in parallel texts. This article describes the clue alignment approach and the optimization of its parameters using a genetic algorithm. Word alignment clues can come from various sources such as statistical alignment models, co-occurrence tests, string similarity scores and static dictionaries. A genetic algorithm implementing an evolutionary procedure can be used to optimize the parameters necessary for combining available clues. Experiments on English/Swedish bitext show a significant improvement of about 6% in F-scores compared to the baseline produced by statistical word alignment.
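
To make the idea of combining weighted clues and tuning the weights with a genetic algorithm more concrete, the following is a minimal sketch and not the article's implementation. Several things here are assumptions: the combination rule (a weighted probabilistic disjunction, 1 - prod(1 - w_k * s_k)), the fixed link threshold of 0.5, the particular genetic operators (truncation selection, uniform crossover, Gaussian mutation), the function names, and the toy English/Swedish clue scores and gold links, all of which are made up for illustration.

```python
import random

# Hypothetical toy data: each candidate link maps to three clue scores in [0, 1]
# (say, a co-occurrence score, a string similarity score, a dictionary lookup).
CANDIDATES = {
    ("house", "hus"): [0.6, 0.8, 1.0],
    ("house", "bil"): [0.3, 0.1, 0.0],
    ("car", "bil"): [0.7, 0.0, 1.0],
    ("car", "hus"): [0.4, 0.0, 0.0],
    ("green", "grön"): [0.5, 0.7, 0.0],
    ("green", "bil"): [0.5, 0.0, 0.0],
}
GOLD = {("house", "hus"), ("car", "bil"), ("green", "grön")}


def combine_clues(scores, weights):
    """Assumed rule: weighted disjunction 1 - prod(1 - w_k * s_k)."""
    remaining = 1.0
    for score, weight in zip(scores, weights):
        remaining *= 1.0 - weight * score
    return 1.0 - remaining


def f_score(weights, candidates, gold, threshold=0.5):
    """Balanced F-score of the links whose combined score reaches the threshold."""
    predicted = {pair for pair, scores in candidates.items()
                 if combine_clues(scores, weights) >= threshold}
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def optimize_weights(candidates, gold, n_clues, pop_size=20,
                     generations=30, mutation_rate=0.2, seed=0):
    """Tiny genetic algorithm over clue weight vectors in [0, 1]^n_clues."""
    rng = random.Random(seed)

    def fitness(weights):
        return f_score(weights, candidates, gold)

    population = [[rng.random() for _ in range(n_clues)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each weight is taken from either parent.
            child = [rng.choice(genes) for genes in zip(a, b)]
            # Gaussian mutation, clipped back into [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = optimize_weights(CANDIDATES, GOLD, n_clues=3)
    print("weights:", best, "F-score:", f_score(best, CANDIDATES, GOLD))
```

In this sketch the fitness of a weight vector is simply the F-score it yields on a small gold standard, mirroring the evaluation measure named in the abstract; a realistic setup would score held-out data to avoid fitting the weights to the evaluation set.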