Bilingual Sentence Alignment: Balancing Robustness and Accuracy

  • Authors: Machine Translation staff

  • Affiliations: -

  • Venue: Machine Translation
  • Year: 1998

Abstract

Sentence alignment is the problem of making explicit the relations that exist between the sentences of two texts that are known to be mutual translations. Automatic sentence-alignment methods typically face two kinds of difficulties. First, there is the question of robustness. In real life, discrepancies between a source text and its translation are quite common: differences in layout, omissions, inversions, etc. Sentence-alignment programs must be ready to deal with such phenomena. Then, there is the question of accuracy. Even when translations are "clean", alignment is still not a trivial matter: some decisions are hard to make, even for humans. We report here on the current state of our ongoing efforts to produce a sentence-alignment program that is both robust and accurate. The method that we propose relies on two new alignment engines: one that produces highly reliable and robust character-level alignments, and one that relies on statistical lexical knowledge to produce accurate mappings. Experimental results are presented which demonstrate the method's effectiveness, and highlight where problems remain to be solved.
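
To make the task concrete, the following is a minimal sketch of sentence alignment as a dynamic-programming search over "beads" (1-1, 1-0, 0-1, 2-1 and 1-2 groupings of sentences). It is not the method proposed in the paper: it uses only character lengths, with neither the character-level engine nor the statistical lexical knowledge described above. The function name `align`, the `BEADS` table and all penalty values are assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

# Allowed bead shapes (source sentences, target sentences) with a fixed
# penalty each. These penalty values are illustrative assumptions.
BEADS = {(1, 1): 0.0, (1, 0): 15.0, (0, 1): 15.0, (2, 1): 5.0, (1, 2): 5.0}

Bead = Tuple[List[int], List[int]]


def align(src: List[str], tgt: List[str]) -> List[Bead]:
    """Return a list of beads; each bead holds source and target indices."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j]: cheapest alignment of the first i source sentences
    # with the first j target sentences.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back: List[List[Optional[Tuple[int, int]]]] = [
        [None] * (m + 1) for _ in range(n + 1)
    ]
    cost[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                # Local cost: bead penalty plus character-length mismatch.
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                c = cost[i][j] + penalty + abs(s_len - t_len)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)

    # Follow back-pointers from the end of both texts to the start.
    beads: List[Bead] = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    beads.reverse()
    return beads
```

A length-only cost of this kind has no way to detect inversions or layout differences, and it cannot settle cases where competing groupings have similar lengths. Those are the robustness and accuracy problems the abstract describes.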