Comparison, selection and use of sentence alignment algorithms for new language pairs

  • Authors:
  • Anil Kumar Singh; Samar Husain

  • Affiliations:
  • LTRC, IIIT, Gachibowli, Hyderabad, India; LTRC, IIIT, Gachibowli, Hyderabad, India

  • Venue:
  • ParaText '05 Proceedings of the ACL Workshop on Building and Using Parallel Texts
  • Year:
  • 2005

Abstract

Several algorithms are available for sentence alignment, but there is a lack of systematic evaluation and comparison of these algorithms under different conditions. In most cases, the factors that can significantly affect the performance of a sentence alignment algorithm have not been considered during evaluation. We have used an evaluation method that can give a better estimate of a sentence alignment algorithm's performance, so that the best one can be selected. Using this method, we have compared four approaches, which have mostly been tried on European language pairs. We have evaluated them on manually checked and validated English-Hindi aligned parallel corpora under different conditions. We also suggest some guidelines for actual alignment.