Review: an overview of the phrase-based statistical machine translation techniques

  • Authors:
  • Marta Ruiz Costa-jussà

  • Affiliations:
  • Barcelona Media Innovation Center, Avenida Diagonal 177, 9th floor, 08018 Barcelona, Spain; e-mail: marta.ruiz@barcelonamedia.org

  • Venue:
  • The Knowledge Engineering Review
  • Year:
  • 2012

Abstract

This work gives a general overview of statistical machine translation (SMT), a subfield of machine translation (MT). Specifically, it focuses on one of the most popular SMT approaches, the phrase-based system. Phrase-based translation units are typically extracted using statistical criteria and are weighted using different models. These models are combined log-linearly during decoding, the step that chooses the most probable translation. Translation quality has improved significantly over the original phrase-based SMT systems. The main remaining challenges include reordering, domain adaptation and evaluation.
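
The log-linear combination mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the feature names, probabilities, weights and candidate sentences below are all hypothetical, and a real decoder searches a much larger hypothesis space rather than scoring a fixed list. Each candidate is scored as a weighted sum of the logarithms of its model scores, and the highest-scoring candidate is selected.

```python
import math


def log_linear_score(features, weights):
    """Return sum_m weight_m * log(feature_m) for one candidate translation."""
    return sum(weights[name] * math.log(value) for name, value in features.items())


def best_translation(candidates, weights):
    """Pick the candidate with the highest log-linear score."""
    return max(candidates, key=lambda c: log_linear_score(c["features"], weights))


# Hypothetical model weights (in practice these are tuned on held-out data).
weights = {"translation_model": 1.0, "language_model": 0.5}

# Hypothetical candidates with made-up model probabilities.
candidates = [
    {
        "text": "the house is small",
        "features": {"translation_model": 0.4, "language_model": 0.3},
    },
    {
        "text": "small is the house",
        "features": {"translation_model": 0.5, "language_model": 0.05},
    },
]

best = best_translation(candidates, weights)
print(best["text"])
```

In this toy setting the second candidate has the higher translation-model probability, but its low language-model probability pulls its combined score below that of the first candidate, which shows how the weighted models trade off against each other.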