Multi-Alignment Templates Induction

  • Authors:
  • Algirdas Laukaitis; Olegas Vasilecas

  • Affiliations:
  • Faculty of Fundamental Sciences, Vilnius Gediminas Technical University, Saulėtekio al. 11, LT-10223 Vilnius, Lithuania, e-mail: algirdas.laukaitis@fm.vgtu.lt, olegas.vasilecas@fm.vgtu.lt

  • Venue:
  • Informatica
  • Year:
  • 2008

Quantified Score

Hi-index 0.02

Abstract

This paper examines approaches to translation between English and morphology-rich languages. Experiments with the English-Russian and English-Lithuanian language pairs reveal that “pure” statistical approaches on a 10-million-word corpus give unsatisfactory translations. Several Web-available linguistic resources are therefore suggested for translation: syntax parsers, bilingual and semantic dictionaries, a bilingual parallel corpus, and a monolingual Web-based corpus are integrated into one comprehensive statistical model. A multi-abstraction language representation is used for the statistical induction of syntactic and semantic transformation rules, called multi-alignment templates. The decoding model is described in terms of feature functions, a log-linear modeling approach, and the A* search algorithm. The approach is evaluated on the English-Lithuanian language pair. The experimental results demonstrate that the multi-abstraction approach and the hybridization of learning methods can improve translation quality.