Transliteration by bidirectional statistical machine translation

  • Authors:
  • Andrew Finch;Eiichiro Sumita

  • Affiliations:
  • NICT, Keihanna Science City, Japan;NICT, Keihanna Science City, Japan

  • Venue:
  • NEWS '09 Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration
  • Year:
  • 2009

Abstract

The system presented in this paper uses phrase-based statistical machine translation (SMT) techniques to transliterate directly between all language pairs in this shared task. The technique makes no language-specific assumptions and uses neither dictionaries nor explicit phonetic information. The translation process transforms sequences of tokens in the source language directly into sequences of tokens in the target language. All language pairs were transliterated by applying this technique in a single unified manner. The machine translation system consisted of two phrase-based SMT decoders. The first generated the target from its first token to its last. The second generated the target from its last token to its first. Our results show that if only one of these decoding strategies is to be chosen, the optimal choice depends on the languages involved, and that in general a combination of the two approaches is able to outperform either approach alone.
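
The abstract does not say how the outputs of the two decoders are combined, so the following is only a minimal sketch of one plausible scheme: linear interpolation of the scores in two n-best lists. The function name `combine_nbest`, the `weight` parameter, and the use of probability-like scores are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def combine_nbest(forward, backward, weight=0.5):
    """Merge n-best lists from two decoders (hypothetical combination scheme).

    forward:  list of (tokens, score) pairs, tokens in first-to-last order.
    backward: list of (tokens, score) pairs, tokens in the order the second
              decoder generated them, that is, last-to-first.
    weight:   interpolation weight for the forward decoder (assumed, not
              from the paper).
    Returns (string, score) pairs sorted from best to worst.
    """
    combined = defaultdict(float)
    for tokens, score in forward:
        combined["".join(tokens)] += weight * score
    for tokens, score in backward:
        # Undo the last-to-first generation order before merging.
        combined["".join(reversed(tokens))] += (1.0 - weight) * score
    # Sort by descending score, breaking ties alphabetically.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Toy n-best lists with made-up scores, not real system output.
forward_list = [(list("abc"), 0.6), (list("abd"), 0.4)]
backward_list = [(list("dba"), 0.7), (list("cba"), 0.3)]
ranking = combine_nbest(forward_list, backward_list)
```

In this toy input the forward decoder prefers "abc" while the backward decoder prefers "abd", and the interpolated ranking places "abd" first. A hypothesis that appears in only one list receives no contribution from the other, which is a simplification a real system would likely handle differently.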