Multi-word unit dependency forest-based translation rule extraction

  • Authors:
  • Hwidong Na; Jong-Hyeok Lee

  • Affiliations:
  • Pohang University of Science and Technology (POSTECH), Pohang, Republic of Korea

  • Venue:
  • SSST-5 Proceedings of the Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation
  • Year:
  • 2011

Abstract

Translation requires a non-isomorphic transformation from the source to the target. However, non-isomorphism can be reduced by learning multi-word units (MWUs). We present a novel way of representing sentence structure based on MWUs, which are not necessarily continuous word sequences. Our proposed method builds a structure over MWUs that is simpler than a dependency structure that uses words as vertices. Unlike previous studies, we collect many alternative structures in a packed forest. As an application of our proposed method, we extract translation rules in the form of a source MWU-forest to a target string, and verify the rule coverage empirically. As a consequence, we improve the rule coverage compared to previous work, while retaining linear asymptotic complexity.