Improve syntax-based translation using deep syntactic structures

  • Authors:
  • Xianchao Wu;Takuya Matsuzaki;Jun'Ichi Tsujii

  • Affiliations:
  • Department of Computer Science, The University of Tokyo, Tokyo, Japan;Department of Computer Science, The University of Tokyo, Tokyo, Japan;Department of Computer Science, The University of Tokyo, Tokyo, Japan and School of Computer Science, University of Manchester, Manchester, UK and National Centre for Text Mining, Manchester, UK

  • Venue:
  • Machine Translation
  • Year:
  • 2010

Abstract

This paper introduces deep syntactic structures to syntax-based Statistical Machine Translation (SMT). We use a Head-driven Phrase Structure Grammar (HPSG) parser to obtain the deep syntactic structures of a sentence, which include not only a fine-grained description of syntactic properties but also a semantic representation. Given the abundant information these structures carry, we investigate whether they improve on traditional syntax-based translation models built on PCFG parsers. To use deep syntactic structures for SMT, this paper focuses on extracting tree-to-string translation rules from aligned HPSG tree-string pairs. The major challenge is to properly localize the non-local relations among nodes in an HPSG tree. To localize the semantic dependencies among words and phrases, which can be inherently non-local, a minimum covering tree is defined by taking a predicate word and its lexical/phrasal arguments as the frontier nodes. Starting from this definition, a linear-time algorithm is proposed that extracts translation rules in a single traversal of the leaf nodes of an HPSG tree. Extensive experiments on a tree-to-string translation system demonstrate the effectiveness of our proposal.
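To make the notion of a minimum covering tree more concrete, the sketch below shows one plausible reading of the definition: given a predicate word and its arguments, it finds their lowest common ancestor and keeps only the nodes on the paths from that ancestor down to each frontier node. This is an illustration under assumptions, not the paper's linear-time extraction algorithm. The toy tree, the node labels, and the function names `path_to_root` and `minimum_covering_tree` are all hypothetical, and a plain parent map stands in for a real HPSG tree.

```python
from typing import Dict, List, Optional, Set, Tuple


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the nodes from `node` up to the tree root, inclusive."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def minimum_covering_tree(
    parent: Dict[str, Optional[str]], frontier: List[str]
) -> Tuple[str, Set[str]]:
    """Naive sketch: smallest subtree whose frontier is the given nodes.

    Assumption: the covering tree is rooted at the lowest common ancestor
    of the predicate and its arguments. Returns (root, nodes in subtree).
    """
    paths = [path_to_root(n, parent) for n in frontier]
    # Ancestors shared by every frontier node.
    common = set(paths[0]).intersection(*(set(p) for p in paths[1:]))
    # Walking upward, the first shared ancestor is the lowest one.
    root = next(n for n in paths[0] if n in common)
    nodes: Set[str] = set()
    for p in paths:
        nodes.update(p[: p.index(root) + 1])
    return root, nodes


# Hypothetical toy tree for "John saw Mary yesterday" (not real HPSG output).
toy_parent: Dict[str, Optional[str]] = {
    "S": None,
    "NP_John": "S",
    "VP": "S",
    "VP_inner": "VP",
    "ADV_yesterday": "VP",
    "V_saw": "VP_inner",
    "NP_Mary": "VP_inner",
}

# Predicate "saw" with its two arguments as frontier nodes.
root, nodes = minimum_covering_tree(toy_parent, ["V_saw", "NP_John", "NP_Mary"])
```

In this toy case the subject `NP_John` sits outside the verb's local phrase, so the covering tree must reach up to `S`, while the unrelated modifier `ADV_yesterday` is left out. The sketch recomputes root paths for every predicate, so it does not reproduce the single-pass, linear-time behaviour described in the abstract.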