Generalizing local translation models

  • Authors: Michael Subotin

  • Affiliations: University of Maryland, College Park, MD

  • Venue: SSST '08 Proceedings of the Second Workshop on Syntax and Structure in Statistical Translation
  • Year: 2008

Abstract

We investigate translation modeling based on exponential estimates that generalize essential components of standard translation models. Applied to a hierarchical phrase-based system, the simplest generalization allows its models of lexical selection and reordering to be conditioned on arbitrary attributes of the source sentence and its annotation. Viewing these estimates as approximations of sentence-level probabilities motivates further elaborations that seek to exploit general syntactic and morphological patterns. Dimensionality control with ℓ1 regularizers makes it possible to negotiate the tradeoff between translation quality and decoding speed. Putting together and extending several recent advances in phrase-based translation, we arrive at a flexible modeling framework that allows efficient leveraging of monolingual resources and tools. Experiments with features derived from the output of Chinese and Arabic parsers and an Arabic lemmatizer show significant improvements over a strong baseline.
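
To make the role of the ℓ1 regularizer concrete, the following is a minimal sketch of a generic exponential (softmax) model trained by proximal gradient descent with soft-thresholding. It is not the paper's implementation: the toy data, the feature layout, and the names `train_l1_maxent`, `soft_threshold`, and `count_nonzero` are all assumptions made for illustration. It only shows how a larger ℓ1 penalty drives more weights to exactly zero, which is the mechanism behind trading model size against accuracy.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def soft_threshold(w, t):
    # Proximal operator of t * |w|: shrink toward zero, clip at zero.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def train_l1_maxent(data, n_labels, n_feats, l1, lr=0.5, epochs=200):
    # data: list of (feature_vector, label); weights are W[label][feature].
    W = [[0.0] * n_feats for _ in range(n_labels)]
    n = len(data)
    for _ in range(epochs):
        grad = [[0.0] * n_feats for _ in range(n_labels)]
        for x, y in data:
            scores = [sum(w * v for w, v in zip(W[k], x)) for k in range(n_labels)]
            p = softmax(scores)
            for k in range(n_labels):
                coeff = p[k] - (1.0 if k == y else 0.0)
                for j in range(n_feats):
                    grad[k][j] += coeff * x[j] / n
        # Gradient step on the log-loss, then the l1 proximal step.
        for k in range(n_labels):
            for j in range(n_feats):
                W[k][j] = soft_threshold(W[k][j] - lr * grad[k][j], lr * l1)
    return W

def count_nonzero(W):
    return sum(1 for row in W for w in row if w != 0.0)

# Hypothetical toy data: features are [context_A, context_B, noise];
# the label is which of two candidate translations was chosen.
data = [
    ([1.0, 0.0, 1.0], 0),
    ([1.0, 0.0, 0.0], 0),
    ([0.0, 1.0, 1.0], 1),
    ([0.0, 1.0, 0.0], 1),
]

dense = train_l1_maxent(data, n_labels=2, n_feats=3, l1=0.0)
sparse = train_l1_maxent(data, n_labels=2, n_feats=3, l1=1.0)
print("nonzero weights without penalty:", count_nonzero(dense))
print("nonzero weights with strong penalty:", count_nonzero(sparse))
```

In this sketch, the penalty strength `l1` is the single knob: at zero the model keeps whatever weights the data supports, while a sufficiently strong penalty prunes every weight. Intermediate values would keep only the most useful features, which in a translation system means fewer feature lookups during decoding.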